Extracting query-document relevance from sparse, biased clickthrough logs is among the most fundamental tasks in a web search system. Prior art mainly learns a relevance judgment model from semantic features of the query and document, and ignores direct counterfactual relevance evaluation from the click log. Although the learned semantic matching models can provide relevance signals for tail queries as long as semantic features are available, such a paradigm lacks the capability to introspectively adjust a biased relevance estimate whenever it conflicts with massive implicit user feedback. Counterfactual evaluation methods, on the contrary, ensure unbiased relevance estimation given sufficient click information, but they suffer from the sparse or even missing clicks caused by the long-tailed query distribution. In this paper, we propose to unify the counterfactual evaluating and learning approaches for unbiased relevance estimation on search queries of varying popularity. Specifically, we theoretically develop a doubly robust estimator with low bias and variance, which intentionally combines the benefits of existing relevance evaluating and learning approaches. We further instantiate the proposed unbiased relevance estimation framework in Baidu search, with comprehensive practical solutions for the data pipeline that tracks click behavior and for online relevance estimation with an approximated deep neural network. Finally, we present extensive empirical evaluations to verify the effectiveness of our proposed framework, finding that it is robust in practice and improves online ranking performance substantially.
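To make the idea of combining an evaluation-style estimate with a learned model concrete, the following is a minimal sketch of the textbook doubly robust estimator from the missing-data literature. It is not the estimator developed in this paper. It assumes that each impression carries a known examination propensity and an observed examination indicator, which real click logs may not provide. All function and parameter names are hypothetical.

```python
from typing import Sequence


def ips_estimate(
    observed: Sequence[int],
    outcomes: Sequence[float],
    propensities: Sequence[float],
) -> float:
    """Inverse-propensity-scored mean of the outcomes.

    observed[i] is 1 if impression i was examined (assumed known here), else 0.
    outcomes[i] is the relevance signal for impression i (ignored if unobserved).
    propensities[i] is the assumed probability that impression i is examined.
    """
    n = len(outcomes)
    return sum(o * y / p for o, y, p in zip(observed, outcomes, propensities)) / n


def doubly_robust_estimate(
    observed: Sequence[int],
    outcomes: Sequence[float],
    propensities: Sequence[float],
    predictions: Sequence[float],
) -> float:
    """Textbook doubly robust mean: model prediction plus an IPS-weighted residual.

    predictions[i] is the relevance predicted by a learned model (hypothetical).
    The estimate is unbiased if either the propensities or the predictions
    are correct, which is the sense in which it is "doubly" robust.
    """
    n = len(outcomes)
    total = 0.0
    for o, y, p, f in zip(observed, outcomes, propensities, predictions):
        # Start from the model prediction, then correct it on observed rows only.
        total += f + o * (y - f) / p
    return total / n
```

With all predictions set to zero the sketch reduces to the plain inverse-propensity estimate, and when the predictions match the observed outcomes the correction term vanishes. In that second case the model's predictions alone determine the estimate, which is the behavior one would want for tail queries with few clicks.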