Research and Insights

By CrowdFlower Admin, July 5, 2016

Optimizing Your Search Engine: Metrics Data Scientists Should Use to Grade and Test Search Algorithms

Your search relevance algorithm is only as good as the data you feed into it. Better algorithms lead to positive user engagement, higher conversions, increased sales, and ultimately a better customer experience. On July 19 CrowdFlower will be hosting a webinar, How to Improve Search Relevance, Business Outcomes and User Experience, featuring customers from Adobe and Etsy’s data science teams. Today, we’ll give you a preview of some of the measurement tips our guests will discuss.

One of the key points our guest speakers have shared with us is that there is no magic bullet for optimizing your search algorithm. You have to weight different metrics depending on your desired outcome. That said, both of our guests agree that to get the best results from your algorithm, you must combine click data with human-curated data.

Here are the most common metrics that inform a search relevance algorithm.


Click Data

  • Click-Through Rate Per Query: A query click-through rate (CTR) is the number of clicks your search result receives divided by the number of times that result is shown. (Clicks ÷ Impressions = CTR). Typically, query-CTR is one of the most discriminating features for predicting whether a user will click on a search result.
  • Conversions Per Query: Similar in formula to query-CTR, the conversions-per-query metric gives you a percentage readout of how many users were driven to a conversion action (e.g. a purchase, a download, a sign-up) by your result. (Conversions ÷ Impressions = Conversions Per Query).
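The two formulas above can be sketched in a few lines. This is a minimal illustration, not production code; the query-log shape here (query, impressions, clicks, conversions) is a hypothetical example, since real logs vary by system.

```python
def ctr(clicks, impressions):
    """Click-through rate per query: clicks divided by impressions."""
    return clicks / impressions if impressions else 0.0

def conversions_per_query(conversions, impressions):
    """Conversion rate per query: conversion actions divided by impressions."""
    return conversions / impressions if impressions else 0.0

# Hypothetical query log: (query, impressions, clicks, conversions)
log = [
    ("red shoes", 1000, 120, 18),
    ("blue hat", 500, 30, 2),
]

for query, impressions, clicks, conversions in log:
    print(f"{query}: CTR={ctr(clicks, impressions):.3f}, "
          f"conv/query={conversions_per_query(conversions, impressions):.3f}")
```

Guarding against zero impressions matters in practice, since rare queries can appear in logs with no recorded impressions.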

Human Curated Data

The above metrics only give you a nearsighted view of your actual search quality. In order to answer deeper questions like “How good is my overall relevance?” and “Where do I stand compared to my competitors?” you need to bring in human measurement to evaluate your search quality in a more objective manner.

  • Per Result Relevance: By leveraging Contributors, you can measure intent by comparing search queries to top results. To run this task, you pull a random sample of search queries from your logs and then pull some of the top results that appear for each query. You then take that set of pairings and score them on a relevance scale of your choosing.
  • Whole Page Relevance: Often used in tandem with Per Result Relevance, the major advantage of Whole Page Relevance is that it accurately portrays what the search experience on your site actually is. If the entire page returns quality matches, you’re delivering a great experience. Much like Per Result Relevance, this is a human-scored metric, so you’ll want to settle on a scoring system.

Putting It All Together with DCG Scoring

Now that you have click data and human curated data, how do you combine them? That’s where Discounted Cumulative Gain (DCG), the most common scoring technique, comes in. DCG combines the position of each result with its human-scored relevance for each query-result pairing, condensing how accurate the results are for a given query into a single, easy-to-understand number.
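In the standard formulation, each result’s relevance score is discounted by the logarithm of its rank, so relevant results near the top of the page count more. A minimal sketch (the example scores are hypothetical):

```python
import math

def dcg(relevances):
    """Discounted Cumulative Gain for one query.

    `relevances` is a list of human-scored relevance values for the
    top results, in ranked order. Each score is discounted by
    log2(position + 1), so position 1 is undiscounted and lower
    positions contribute progressively less.
    """
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

# Human-scored relevance (0-3 scale) for the top five results of one query.
scores = [3, 2, 3, 0, 1]
print(f"DCG = {dcg(scores):.4f}")
```

In practice DCG is often normalized against the ideal ordering of the same scores (NDCG) so that queries with different numbers of relevant results can be compared on a 0–1 scale.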

Conclusion

A human relevance measurement system is fundamental to building a great, long-lasting search engine. To learn more, make sure to sign up for our webinar on July 19. Along with getting your hands on some real-life search relevance examples from the data science leaders of CrowdFlower, Adobe, and Etsy, CrowdFlower will also be sharing a free Excel-based DCG calculator after the presentation.