As a hosted search engine service, we discuss the relevance aspect of search with our customers and prospects all day long. We now have more than 1,500 customers and have seen a large variety of real-life search problems. It's interesting to note that more often than not, these problems are in some way connected to the R word: relevance.
Relevance is a well-understood concept in search engines, but it is pretty complex to measure, because the notion of relevance carries a high degree of subjectivity. In fact, we've seen time and again that too many people spend too much time trying to control their relevance.
In this post, we’ll share our top 10 tips to help people achieve good relevance in their search results.
It might seem like we're stating the obvious, but the first step to having good relevance is structuring your data correctly. A single long string that contains all your information concatenated won't put you on the path to good relevance.
You should have an object structured with different attributes and different strings, so the engine can weigh matches in your preferred order of importance. Keeping list values separate instead of concatenating them also keeps the proximity measure accurate: in a concatenated string, the last word of one value sits right next to the first word of the next value. (Proximity is an important element of relevance that measures how close the query terms are in the matched document.)
Here is an example using a movie title:
{
  "title": "Fast & Furious 6",
  "alternative_titles": [
    "The Fast and the Furious 6",
    "速度与激情6",
    ...
  ],
  "genre": ["Action", "Thriller", "Crime"],
  "objectID": "440309800"
}
Which works better than:
{ "movie": "Fast & Furious 6 | The Fast and the Furious 6 | 速度与激情6 | Action, Thriller, Crime" }
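To make the proximity point concrete, here is a minimal Python sketch. The `tokenize`, `min_distance`, and `best_distance_per_value` helpers are hypothetical names invented for this illustration, and the word-position distance is a simplified stand-in for a real proximity measure, not the engine's implementation.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def min_distance(tokens, a, b):
    """Smallest gap in token positions between words a and b, or None."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)

def values_of(record):
    """Yield every string value of a record, flattening list attributes."""
    for value in record.values():
        if isinstance(value, list):
            yield from value
        else:
            yield value

def best_distance_per_value(record, a, b):
    """Best proximity when each value is scored on its own."""
    distances = []
    for value in values_of(record):
        d = min_distance(tokenize(value), a, b)
        if d is not None:
            distances.append(d)
    return min(distances) if distances else None

structured = {
    "title": "Fast & Furious 6",
    "alternative_titles": ["The Fast and the Furious 6", "速度与激情6"],
    "genre": ["Action", "Thriller", "Crime"],
    "objectID": "440309800",
}
concatenated = (
    "Fast & Furious 6 | The Fast and the Furious 6 | 速度与激情6 | "
    "Action, Thriller, Crime"
)

# "6" ends the title and "the" starts the next value: adjacent only by accident.
flat = min_distance(tokenize(concatenated), "6", "the")
per_value = best_distance_per_value(structured, "6", "the")
print(flat, per_value)
```

In the concatenated string the two words look adjacent because a value boundary was erased. Scored value by value, they are only as close as they really are inside the alternative title.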
Before doing advanced tuning, there are a lot of small checks you can do to achieve a decent level of textual relevance. For example, you should be able to find a record that contains "iphone" with the queries "i-phone" and "i phone", or find a record containing "hi-speed usb" with the query "hispeed usb". You should also check that you handle abbreviations and are able to find a record containing "U.S.A" with the "USA" query and vice versa.
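These checks are easy to turn into a small checklist you can run against any matcher. The sketch below is only an illustration: `squash`, `naive_match`, `strict_match`, and `failed_checks` are hypothetical helpers, and `naive_match` is a deliberately crude stand-in (it ignores all separators, so it is far too permissive for real use) rather than how a search engine normalizes text.

```python
import re

def squash(text):
    """Lowercase and drop everything that is not a letter or a digit."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def naive_match(record, query):
    """Crude matcher: compare both sides with all separators removed."""
    return squash(query) in squash(record)

def strict_match(record, query):
    """Baseline matcher: plain case-insensitive substring search."""
    return query.lower() in record.lower()

# (record content, query) pairs taken from the checks described above.
CHECKS = [
    ("iphone", "i-phone"),
    ("iphone", "i phone"),
    ("hi-speed usb", "hispeed usb"),
    ("U.S.A", "USA"),
    ("USA", "U.S.A"),
]

def failed_checks(match):
    """Return the (record, query) pairs that the matcher cannot find."""
    return [(record, query) for record, query in CHECKS if not match(record, query)]

print(len(failed_checks(strict_match)), "checks fail with plain substring search")
print(len(failed_checks(naive_match)), "checks fail with separators ignored")
```

The value here is the checklist itself: swap in a call to your real search as the `match` argument and the same pairs tell you which basic cases are still not covered.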
Typos are very frequent, especially because of the small virtual keyboards of mobile devices (and the famous fat-finger effect). If your record contains "iphone", you should be able to find it via "ipjone" or "iphoen". Users also love as-you-type search experiences, so you should ensure that you are able to tolerate typos on the prefix. For example, the query "mikca" should be considered a prefix of "mickael" (because "mikca" is one typo away from "micka").
All these cases are automatically handled for you by the Algolia engine.
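If you want to test this behavior yourself, the sketch below shows one way to express it in Python. It uses the optimal string alignment distance (edit distance where swapping two adjacent letters counts as a single typo) as an assumed definition of "one typo". The function names are hypothetical, and this is not a description of how the Algolia engine works internally.

```python
def osa_distance(a, b):
    """Optimal string alignment distance: insertions, deletions,
    substitutions and adjacent transpositions each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "en" typed instead of "ne".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]

def is_prefix_with_typos(query, word, max_typos=1):
    """True if the query is within max_typos of some prefix of word."""
    return any(
        osa_distance(query, word[:k]) <= max_typos
        for k in range(len(word) + 1)
    )

print(osa_distance("ipjone", "iphone"))
print(osa_distance("iphoen", "iphone"))
print(is_prefix_with_typos("mikca", "mickael"))
```

Checking every prefix is quadratic and only suitable for a test script. It is meant to state the expected behavior, not to be an efficient matching strategy.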
Relevance is not a one-day job, and you will discover specific queries with relevance issues over time. Avoid optimizing your relevance for a specific query without keeping the big picture in mind. You need a setup that lets you efficiently test each modification against your entire set of query logs, so that you don't degrade the relevance of other queries. A good non-regression test suite is mandatory.
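A non-regression suite can start very small. The following Python sketch assumes you can wrap your search in a function that takes a query and returns an ordered list of record IDs. `run_regression`, `fake_search`, and the sample IDs are hypothetical placeholders, not part of any real API.

```python
def run_regression(search, expectations, top_n=3):
    """Run every logged query and collect the ones whose expected record
    is missing from the top_n results.

    search       -- callable taking a query string, returning ordered IDs
    expectations -- dict mapping a query to the ID that must rank in the top_n
    Returns a dict {query: actual_top_ids} for the failing queries only.
    """
    failures = {}
    for query, expected_id in expectations.items():
        top = search(query)[:top_n]
        if expected_id not in top:
            failures[query] = top
    return failures

# Stand-in for a real search call, used here only to make the sketch runnable.
FAKE_INDEX = {
    "iphone": ["p1", "p2"],
    "usb cable": ["p7", "p3"],
}

def fake_search(query):
    return FAKE_INDEX.get(query, [])

EXPECTATIONS = {"iphone": "p1", "usb cable": "p3", "hdmi": "p9"}

failures = run_regression(fake_search, EXPECTATIONS)
print(failures)  # {'hdmi': []}
```

Running the same expectations before and after each configuration change shows which queries a modification degrades, rather than only the one query you set out to fix.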
When we talk to different search teams, we see that they are all used to configuring "boosts", and it comes as a reflex for them. By boosts they mean integers that they configure in their ranking formula, as in “I configured a boost of 5 on my title attribute, 2 on my brand attribute and 1 on my description attribute,” which is kind of code for “The attribute title is 5 times more important than description, and brand is twice as important as description.”
Unfortunately these "boosts" are not so great. Changing one from a value X to a value Y is the most common source of issues we see every day! No engineer is able to predict what will happen, because the boost is combined with a lot of other factors that make the effect of the modification unpredictable (you can see it as one integer mixed with other elements in a big mathematical formula). In other words, even if you have non-regression tests, a change of boost can totally change your result set, and it will be close to impossible to say whether the new configuration is better than the previous one.
This is actually why we designed a different way to handle search relevance with a predictable approach.
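A toy example shows how fragile boosts are. In the Python sketch below, the records, the match counts, and the `popularity` factor are invented for illustration, and the boosts 5, 2, and 1 come from the example above. The `rank_by_tie_breaking` function illustrates one possible kind of predictable alternative (comparing criteria one after the other instead of summing them). It is an assumption made for this sketch, not a description taken from this post.

```python
BOOSTS = {"title": 5, "brand": 2, "description": 1}

# Number of query-term matches per attribute, plus one extra ranking factor.
RECORDS = [
    {"id": "A", "title": 1, "brand": 0, "description": 0, "popularity": 1},
    {"id": "B", "title": 0, "brand": 1, "description": 2, "popularity": 1},
]

def boosted_score(record, boosts):
    """Classic weighted sum: every factor is mixed into a single number."""
    return sum(boosts[attr] * record[attr] for attr in boosts) + record["popularity"]

def rank_by_boost(records, boosts):
    ordered = sorted(records, key=lambda r: boosted_score(r, boosts), reverse=True)
    return [r["id"] for r in ordered]

def rank_by_tie_breaking(records, criteria=("title", "brand", "description", "popularity")):
    """Compare criteria in order; later ones only break ties on earlier ones."""
    ordered = sorted(records, key=lambda r: tuple(r[c] for c in criteria), reverse=True)
    return [r["id"] for r in ordered]

print(rank_by_boost(RECORDS, BOOSTS))                  # A scores 6, B scores 5
print(rank_by_boost(RECORDS, {**BOOSTS, "title": 3}))  # A scores 4, B scores 5
print(rank_by_tie_breaking(RECORDS))
```

Lowering the title boost from 5 to 3 silently flips the order of the two records, even though nobody changed the description or brand settings. With ordered criteria, the outcome depends only on the order of the criteria, which is much easier to reason about.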
Sometimes your search can be good but the user can perceive the results as bad because there is no explanation of the match.
The best and most intuitive way to explain a search result is to visually highlight the matching query terms in the result, but there are also two cases that are often not well handled:
As obvious as it seems, before trying to use a recipe you should be sure that your use case is compatible with it. This is the case with stop word removal (removing the most commonly used words in a given language like "the", "of", "to", "be", "or", …).
But those words can be very useful, and removing them sometimes hurts relevance. For example, if you search for the “To Beta or not to Beta” article on Hacker News with stop words removed, the engine will end up with the query “Beta.”
There are even worse queries, such as a search for the artist “The The.” In this case you would just get a no-results page!
Of course, there are cases where removing stop words is useful, such as when you have a query in natural language or when you are trying to search for similar content. But those cases are more the exception than the norm. Be wary of removing stop words!
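The two failure cases above are easy to reproduce. This Python sketch uses a tiny hand-written stop word list. It includes "not", which is an assumption needed to reproduce the “Beta” example, and `remove_stop_words` is a hypothetical helper rather than any engine's actual behavior.

```python
# The words quoted in the text, plus "not" (assumed) to reproduce the example.
STOP_WORDS = {"the", "of", "to", "be", "or", "not"}

def remove_stop_words(query):
    """Lowercase the query, split on whitespace, and drop stop words."""
    return [token for token in query.lower().split() if token not in STOP_WORDS]

print(remove_stop_words("To Beta or not to Beta"))  # ['beta', 'beta']
print(remove_stop_words("The The"))                 # []
```

The first query keeps only one distinct word, and the second has nothing left to search with, which is how a perfectly valid query turns into a no-results page.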
There will always be some specific queries that are complex to handle. This is the case for the TV show called “V.” This query is particularly challenging in an instant search use case:
"v" in your data set.
Another type of corner case is the use of symbols, for instance if you are looking for the band “!!!.” We encounter such problems with symbols in almost every use case.
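Both corner cases can be reproduced with a naive tokenizer and prefix matcher. The Python sketch below is an assumption-heavy illustration: the titles are made-up sample data, and `tokenize` and `prefix_hits` are hypothetical helpers that stand in for a simplistic as-you-type search.

```python
import re

# Made-up sample data for the illustration.
TITLES = ["V", "Vikings", "Veep", "The Voice", "Lost"]

def tokenize(text):
    """Keep only word characters, which silently drops symbols."""
    return re.findall(r"\w+", text.lower())

def prefix_hits(query, titles):
    """Titles in which every query token is a prefix of some title token."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return []  # nothing left to search with
    return [
        title for title in titles
        if all(any(word.startswith(q) for word in tokenize(title)) for q in query_tokens)
    ]

print(prefix_hits("v", TITLES))  # ['V', 'Vikings', 'Veep', 'The Voice']
print(tokenize("!!!"))           # []
print(prefix_hits("!!!", TITLES))
```

With prefix matching, the one-letter query matches everything that starts with "v", so the exact title has to be ranked above all the prefix matches. The symbol-only query disappears at tokenization time, before any matching happens.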
Natural languages have a lot of variety that can cause your records to not be returned, for example if you use a singular word in your query and your record contains the plural. There are some language-specific heuristics that help address this problem. The most popular are:
"running" in "run". The most popular open source stemmer is Snowball, which is based on a set of rules per language.
The major drawback of these approaches is that they only address one language. In practice, we see very few cases where the data contains words from only one language, and these techniques can produce noise on proper names such as last names or brands. Think, for example, of a people search on a social network, where these approaches can introduce bad results.
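The noise on proper names is easy to see with even a trivial rule set. The `toy_stem` function below is a hypothetical, two-rule suffix stripper written for this illustration. It is not Snowball and is far less careful than a real stemmer, but it fails on names in the same way.

```python
VOWELS = "aeiou"

def toy_stem(word):
    """Toy English stemmer: strips a trailing "ing" or a plural "s"."""
    w = word.lower()
    if w.endswith("ing") and len(w) > 5:
        w = w[:-3]
        # Undo consonant doubling: "runn" -> "run".
        if len(w) >= 2 and w[-1] == w[-2] and w[-1] not in VOWELS:
            w = w[:-1]
    elif w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        w = w[:-1]
    return w

print(toy_stem("running"))  # run
print(toy_stem("cars"))     # car
print(toy_stem("Jones"))    # jone
print(toy_stem("Rogers") == toy_stem("Roger"))  # True
```

The rule that usefully maps "cars" to "car" also mangles "Jones" and makes the last name "Rogers" indistinguishable from the first name "Roger", which is exactly the kind of bad result a people search cannot afford.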
The first eight tips target the textual relevance, but you should also include business data in order to have good relevance. It can be just a basic metric like the number of page views or something more advanced like the number of times a product was put in a cart.
It can even be an advanced metric which relates to the query like “the number of times a product was bought when searched with a particular query”.
In our experience, the addition of business data makes a big difference if the textual relevance is good. That said, business relevance should not bypass textual relevance, or you risk losing all the benefits of the hard work done on relevance! Textual relevance should (almost always) go first. Only when textual relevance cannot decide which of two hits should go first should the engine use the associated business data.
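The "textual relevance first, business data as the tie-breaker" rule can be sketched as a two-level sort. In this Python illustration, the hits, the `typos` field (used here as a stand-in for textual relevance, lower is better), and the `page_views` metric are all hypothetical.

```python
HITS = [
    {"objectID": "1", "typos": 0, "page_views": 120},
    {"objectID": "2", "typos": 1, "page_views": 9000},
    {"objectID": "3", "typos": 0, "page_views": 450},
]

def rank(hits):
    """Sort by textual relevance first (fewer typos), then by page views."""
    return sorted(hits, key=lambda h: (h["typos"], -h["page_views"]))

ranked_ids = [hit["objectID"] for hit in rank(HITS)]
print(ranked_ids)  # ['3', '1', '2']
```

Record 2 has by far the most page views, but it only matches with a typo, so it stays last. Popularity only decides the order between records 1 and 3, which are textually equivalent.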
Personalization of search is the final touch to get the perfect relevance, and it is the part that most people don't really see. Let's take a simple example: if you search for “milk” at your favorite online grocery store, and that store has applied all the previous tips, you will find the most popular milk bottle. But if you are a regular customer of this store and have already bought a particular bottle of milk several times in the past, you're likely to expect this one first. This is the ultimate way to make the user love the search results and avoid the perception of bad relevance. In other words, it's the icing on the cake!
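The milk example can be sketched as one more sorting level. Everything in this Python illustration is hypothetical: the hits, the `text_rank` and `popularity` fields, the per-user purchase counts, and the choice to place purchase history between textual relevance and popularity, which is an assumption made for the sketch.

```python
HITS = [
    {"objectID": "milk-a", "text_rank": 0, "popularity": 900},
    {"objectID": "milk-b", "text_rank": 0, "popularity": 300},
    {"objectID": "milk-c", "text_rank": 0, "popularity": 50},
]

def personalized_rank(hits, purchase_counts):
    """Textual relevance first, then the user's own purchases, then popularity."""
    ordered = sorted(
        hits,
        key=lambda h: (
            h["text_rank"],                          # lower is better
            -purchase_counts.get(h["objectID"], 0),  # more past purchases first
            -h["popularity"],                        # then most popular
        ),
    )
    return [h["objectID"] for h in ordered]

print(personalized_rank(HITS, {}))             # anonymous user
print(personalized_rank(HITS, {"milk-c": 4}))  # regular buyer of milk-c
```

An anonymous user sees the most popular bottle first, while the regular customer sees the bottle they keep buying, without the textual relevance of any result being overridden.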
We hope this list of advice will help you build better search functionality for your website or app. The list is unfortunately not exhaustive, as relevance is a pretty complex domain and there are a lot of specific problems that we do not cover here.
Our team is dedicated to helping you achieve better relevance. Feel free to contact us at contact(at)algolia.com to share your problems, and we will be happy to analyze them with you.