Meta-Learning for Query Conceptualization at Web Scale
Fred X. Han: University of Alberta; Di Niu: University of Alberta; Haolan Chen: Tencent; Weidong Guo: Tencent; Shengli Yan: Tencent; Bowei Long: Tencent
Concepts naturally constitute an abstraction for fine-grained entities and knowledge in the open domain. They enable search engines and recommendation systems to enhance user experience by discovering a high-level abstraction of a search query and the user intent behind it. In this paper, we study the problem of query conceptualization, which is to find the most appropriate matching concepts for a given search query from a large pool of pre-defined concepts. We propose a coarse-to-fine approach that first reduces the search space for each query through a shortlisting scheme and then identifies the matching concepts using pre-trained language models that are meta-tuned to our query-concept matching task. The shortlisting scheme uses a GRU-based Relevant Words Generator (RWG) to expand and complete the context of the given query, and then shortlists candidate concepts through a scoring mechanism based on word overlap. To accurately identify the most appropriate matching concepts for a query, even when a concept has no verbatim overlap with the query, we meta-fine-tune a BERT pairwise text-matching model under the Reptile meta-learning algorithm, which achieves zero-shot transfer learning on the conceptualization problem. Our two-stage framework can be trained with data derived entirely from a search click graph, without requiring any human labelling effort. For evaluation, we constructed a large click graph from more than $7$ million instances of click history recorded in the Tencent QQ browser and performed the query conceptualization task on a large ontology with $159,148$ unique concepts. Results from a range of evaluation methods, including an offline evaluation procedure on the click graph, human evaluation, online A/B testing and case studies, demonstrate the superiority of our approach over a number of competitive pre-trained language models and fine-tuned neural network baselines.
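To make the shortlisting stage concrete, the following is a minimal sketch of overlap-based candidate scoring. It is an illustration under stated assumptions, not the scoring mechanism used in the paper: the abstract does not specify the scoring formula, so the fraction-of-concept-words score, the function names `overlap_score` and `shortlist`, the `top_k` parameter, and the hard-coded relevant words standing in for RWG output are all hypothetical.

```python
def overlap_score(expanded_query_words, concept_words):
    """Fraction of the concept's words that also occur in the expanded query.

    Assumption: this normalization is illustrative only; the paper's actual
    scoring mechanism is not given in the abstract.
    """
    if not concept_words:
        return 0.0
    query_set = set(expanded_query_words)
    hits = sum(1 for word in concept_words if word in query_set)
    return hits / len(concept_words)


def shortlist(query_words, relevant_words, concepts, top_k=3):
    """Return up to top_k concepts with non-zero overlap, best first.

    relevant_words stands in for the output of a Relevant Words Generator;
    here it is supplied by the caller rather than produced by a model.
    """
    expanded = list(query_words) + list(relevant_words)
    scored = [(overlap_score(expanded, concept.split()), concept)
              for concept in concepts]
    # Drop concepts that share no word with the expanded query.
    scored = [(score, concept) for score, concept in scored if score > 0]
    # Sort by descending score, breaking ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [concept for _, concept in scored[:top_k]]


if __name__ == "__main__":
    query = ["camry", "mileage"]
    # Hypothetical RWG output for the query above.
    relevant = ["toyota", "sedan", "fuel", "economy", "car"]
    pool = ["fuel efficient car", "family sedan",
            "budget smartphone", "japanese car"]
    print(shortlist(query, relevant, pool, top_k=2))
```

In this toy example the raw query shares no word with any concept, so the expansion step is what makes the overlap score usable at all. A second-stage matcher, such as the meta-tuned BERT model described above, would then rank only the shortlisted candidates.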