Abstract

Search engines play a crucial role in our daily lives. Relevance is the core problem of a commercial search engine. It has attracted thousands of researchers from both academia and industry and has been studied for decades. Relevance in a modern search engine has gone far beyond text matching, and now involves tremendous challenges. The semantic gap between queries and URLs is the main barrier for improving base relevance. Clicks help provide hints to improve relevance, but unfortunately for most tail queries, the click information is too sparse, noisy, or missing entirely. For comprehensive relevance, the recency and location sensitivity of results are also critical.

In this paper, we give an overview of the solutions for relevance in the Yahoo search engine. We introduce three key techniques for base relevance – ranking functions, semantic matching features and query rewriting. We also describe solutions for recency sensitive relevance and location sensitive relevance. This work builds upon 20 years of existing efforts on Yahoo search, summarizes the most recent advances and provides a series of practical relevance solutions. The reported performance is based on Yahoo’s commercial search engine, where tens of billions of URLs are indexed and served by the ranking system.



