Search by Algolia
Add InstantSearch and Autocomplete to your search experience in just 5 minutes
product

Add InstantSearch and Autocomplete to your search experience in just 5 minutes

A good starting point for building a comprehensive search experience is a straightforward app template. When crafting your application’s ...

Imogen Lovera

Senior Product Manager

Best practices of conversion-focused ecommerce website design
e-commerce

Best practices of conversion-focused ecommerce website design

The inviting ecommerce website template that balances bright colors with plenty of white space. The stylized fonts for the headers ...

Catherine Dee

Search and Discovery writer

Ecommerce product listing pages: what they are and how to optimize them for maximum conversion
e-commerce

Ecommerce product listing pages: what they are and how to optimize them for maximum conversion

Imagine an online shopping experience designed to reflect your unique consumer needs and preferences — a digital world shaped completely around ...

Vincent Caruana

Senior Digital Marketing Manager, SEO

DevBit Recap: Winter 2023 — Community
engineering

DevBit Recap: Winter 2023 — Community

Winter is here for those in the northern hemisphere, with thoughts drifting toward cozy blankets and mulled wine. But before ...

Chuck Meyer

Sr. Developer Relations Engineer

How to create the highest-converting product detail pages (PDPs)
e-commerce

How to create the highest-converting product detail pages (PDPs)

What if there were a way to persuade shoppers who find your ecommerce site, ultimately making it to a product ...

Vincent Caruana

Senior Digital Marketing Manager, SEO

Highlights from GopherCon Australia 2023
engineering

Highlights from GopherCon Australia 2023

This year a bunch of our engineers from our Sydney office attended GopherCon AU at University of Technology, Sydney, in ...

David Howden
James Kozianski

David Howden &

James Kozianski

Enhancing customer engagement: The role of conversational commerce
e-commerce

Enhancing customer engagement: The role of conversational commerce

Second only to personalization, conversational commerce has been a hot topic of conversation (pun intended) amongst retailers for the better ...

Michael Klein

Principal, Klein4Retail

Craft a unique discovery experience with AI-powered recommendations
product

Craft a unique discovery experience with AI-powered recommendations

Algolia’s Recommend complements site search and discovery. As customers browse or search your site, dynamic recommendations encourage customers to ...

Maria Lungu

Frontend Engineer

What are product detail pages and why are they critical for ecommerce success?
e-commerce

What are product detail pages and why are they critical for ecommerce success?

Winter is coming, along with a bunch of houseguests. You want to replace your battered old sofa — after all,  the ...

Catherine Dee

Search and Discovery writer

Why weights are often counterproductive in ranking
engineering

Why weights are often counterproductive in ranking

Search is a very complex problem Search is a complex problem that is hard to customize to a particular use ...

Julien Lemoine

Co-founder & former CTO at Algolia

How to increase your ecommerce conversion rate in 2024
e-commerce

How to increase your ecommerce conversion rate in 2024

2%. That’s the average conversion rate for an online store. Unless you’re performing at Amazon’s promoted products ...

Vincent Caruana

Senior Digital Marketing Manager, SEO

How does a vector database work? A quick tutorial
ai

How does a vector database work? A quick tutorial

What’s a vector database? And how different is it than a regular-old traditional relational database? If you’re ...

Catherine Dee

Search and Discovery writer

Removing outliers for A/B search tests
engineering

Removing outliers for A/B search tests

How do you measure the success of a new feature? How do you test the impact? There are different ways ...

Christopher Hawke

Senior Software Engineer

Easily integrate Algolia into native apps with FlutterFlow
engineering

Easily integrate Algolia into native apps with FlutterFlow

Algolia's advanced search capabilities pair seamlessly with iOS or Android Apps when using FlutterFlow. App development and search design ...

Chuck Meyer

Sr. Developer Relations Engineer

Algolia's search propels 1,000s of retailers to Black Friday success
e-commerce

Algolia's search propels 1,000s of retailers to Black Friday success

In the midst of the Black Friday shopping frenzy, Algolia soared to new heights, setting new records and delivering an ...

Bernadette Nixon

Chief Executive Officer and Board Member at Algolia

Generative AI’s impact on the ecommerce industry
ai

Generative AI’s impact on the ecommerce industry

When was your last online shopping trip, and how did it go? For consumers, it’s becoming arguably tougher to ...

Vincent Caruana

Senior Digital Marketing Manager, SEO

What’s the average ecommerce conversion rate and how does yours compare?
e-commerce

What’s the average ecommerce conversion rate and how does yours compare?

Have you put your blood, sweat, and tears into perfecting your online store, only to see your conversion rates stuck ...

Vincent Caruana

Senior Digital Marketing Manager, SEO

What are AI chatbots, how do they work, and how have they impacted ecommerce?
ai

What are AI chatbots, how do they work, and how have they impacted ecommerce?

“Hello, how can I help you today?”  This has to be the most tired, but nevertheless tried-and-true ...

Catherine Dee

Search and Discovery writer

Looking for something?

facebookfacebooklinkedinlinkedintwittertwittermailmail

How do you measure the success of a new feature? How do you test the impact? There are different ways of doing this including feature flags. Many people have used A/B testing solutions, too, like Optimizely or other third party systems to test the impact of their features.

When it comes to Algolia, our customers want to understand how a new feature might impact conversion and click-through rates, or boost revenue. Within our product dashboard is an A/B testing feature that customers can use to run their tests. Customers will start an A/B test with a hypothesis — perhaps some change like dynamic reranking or AI search might increase the click-through rate or conversion rate over and above their control.

AB testing search

The dashboard shows click through rate (CTR) and conversion rate (CVR). As the test progresses, we measure things like uplift and significance, too. In this example below, we’re measuring results from a personalization test between two different variants. These tests are live on a site, and clicks and conversions can sometimes be tainted with inaccurate data.

AB test example

One thing you may notice on this screen above is that there is a difference in Tracked Searches. Quite a big difference. While a 50/50 traffic split is what the customer has configured, the distribution of them is actually quite large. It’s an almost 3% difference which equates to 20,000 more tracked searches in one test versus another. Ideally it should be a little bit better than this, a little bit closer to a true 50/50 split.

Why the difference? Bots are basically the answer. Unless a customer is explicitly disabling click analytics, you could get these bots doing thousands and thousands of searches. It throws off click-through and conversion statistics, which in turn affects results. Of course, it’s not just a nuisance; it has real ramifications for our customers who are attempting to optimize their businesses. This activity interferes with getting clean results.

We wanted to find a way to remove these outliers automatically to give customers more accurate results. In this blog, I will explain how we set out to tackle this problem.

By the numbers

Here’s a simple example. Let’s say you have one A/B test variant with 10,000 searches and 1500 conversions, it’s a 15% conversion rate. Just a simple division. A second variant has 10,900 searches with 1600 conversions, or a 14.6% conversion rate.

search outliers

If the second variant was caused by bot traffic, you get a bad uplift. It went from 15% conversion to 14.6%. Our customers will see this as a bad trend. If we fix the problem, we might discover that 1000 searches were false positives. When we remove those, suddenly the real results are able to come through.

outliers removed

In fact, the test had a great outcome with a 16% conversion rate. This is a very skewed example, but it makes the point. It can also go in the other direction: sometimes when we remove bot traffic, it could lead to a worse uplift, but it’s a more honest approximation. It’s an objectively more accurate uplift.

What is an outlier?

what is an outlier

With the new setup, we were able to more quickly and accurately label data as an outlier. So what is an outlier? If you look at a normal log distribution for something that might look like the graph above. This graph is hypothetical — in most sites, the data is quite different and includes a much deeper longtail. However, for this hypothetical example, you’ll see that most people will do between two and five searches on a website. They’re not going to be doing thousands of searches. As you count searches across thousands of sessions, you’ll find outliers on the far right — people who are doing a really high number of searches. And so the question is: how do you quantify that? Where do you set the cutoff point between legit searches and what appear to be fake data?

At Algolia, we’ve set a threshold of at least 100 searches, and then they need to have greater than seven standard deviations away from the mean number of searches per user, which is a ridiculous number. In other words, there has to be a crazy number of searches outside of the mean for an application in order to qualify for this.

The flat 100 searches is to exclude these very unique data sets where you have one single search being 99% of your searches. It helps ensure that people who do 2 or 3 searches are included, but anybody that’s doing greater than 99.99999 is a good candidate for being labeled an outlier.

what is an outlier
The graph above is called a log-normal distribution. To get the mean and standard deviation, we take the natural log of each of this data set which gives us something that looks more like a normal distribution’s bell curve. At this point, we can then take normal mean and standard deviation. We transform our data into something like this:

log frequency
We can then classify all those users with a threshold and label them as outliers. To develop this method, one of our engineers, Hugo Rybinski, investigated different algorithms for detecting outliers. He looked at things like isolation forests and generating data, and decided that it’s much easier and faster to simply label people with too many searches as outliers. Another teammate, Raymond Rutjes, also did a lot of the groundwork to help us arrive at this solution today. He had started implementing outlier removal. So, a big shout out to both of them!

A/B testing set up

To exclude outliers, we added a global configuration to each test that gets created whereby outliers are excluded by default. We aggregate all of the data and exclude outliers from calculations.

And the result looks something like this:

outliers removed outlier removal

We can return results to customers with the excluded outliers. In the example above, it’s an A/B test with a 75/25 split. One cohort had 29,000 tracked searches removed, which represented 26 outlier users. The more accurate data helps customers to be able to interpret the A/B test and make informed decisions and hopefully increases their trust in the data.

What’s next for A/B testing

I hope you enjoyed this peek inside Algolia engineering and how we have approached this problem. A/B testing is a core feature of the product that enables customers to improve the search algorithm and take advantage of different features, so have continue to focus on developing improvements for it. In fact, we have even more coming soon such as

  • Support for additional feature-specific A/B testing.
  • Syntax tree representation of filters — dynamic query structuring to operate on certain types of searches.
  • Sample size estimates to help customers know exactly how long to run their tests for.

… and way more! You can learn more about running A/B testing in our documentation.

About the author
Christopher Hawke

Senior Software Engineer

linkedin

Recommended Articles

Powered byAlgolia Algolia Recommend

How to improve site search with Algolia A/B testing
e-commerce

Loise Mercier

5 ways to A/B test your search for relevance
ux

Louise Vollaire

Product Marketing Manager

Evaluating the best AB testing metrics for search
engineering

Nicolas Fiorini

Senior Machine Learning Engineer