A Simple yet Effective Framework for Active Learning to Rank

From arXiv. This paper has been accepted to Machine Intelligence Research, and a short version was presented at the NeurIPS 2022 Workshop on Human in the Loop Learning.

China has become the biggest online market in the world, with around 1 billion internet users, and Baidu runs the world's largest Chinese search engine, serving hundreds of millions of daily active users and responding to billions of queries per day. To handle the diverse query requests from users at web scale, Baidu has made tremendous efforts in understanding users' queries, retrieving relevant content from a pool of trillions of webpages, and ranking the most relevant webpages at the top of the results. Among the components used in Baidu search, learning to rank (LTR) plays a critical role, and we need to label an extremely large number of queries together with their relevant webpages in a timely manner to train and update the online LTR models. To reduce the cost and time consumption of query/webpage labeling, in this work we study the problem of Active Learning to Rank (active LTR), which selects unlabeled queries for annotation and training. Specifically, we first investigate the Ranking Entropy (RE) criterion, which characterizes the entropy of relevant webpages under a query as produced by a sequence of online LTR models updated at different checkpoints, using a Query-By-Committee (QBC) method. Then, we explore a new criterion, Prediction Variances (PV), which measures the variance of the prediction results for all relevant webpages under a query. Our empirical studies find that RE may favor low-frequency queries from the pool for labeling, while PV tends to prioritize high-frequency queries. Finally, we combine these two complementary criteria as the sample selection strategy for active learning. Extensive experiments with comparisons to baseline algorithms show that the proposed approach can train LTR models that achieve higher Discounted Cumulative Gain (i.e., a relative improvement of ΔDCG4 = 1.38%) with the same budgeted labeling effort.
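The abstract does not give formulas for the two criteria, so the following is only a minimal sketch of how such a selection step might look, not the authors' implementation. It assumes that RE is a pairwise vote entropy over a committee of checkpoints, that PV is the variance of the latest checkpoint's scores over a query's webpages, and that the two are combined by a weighted sum after min-max scaling. The function names, the `weight` parameter, and the combination rule are all hypothetical.

```python
import math
from itertools import combinations
from statistics import pvariance


def ranking_entropy(committee_scores):
    """Pairwise vote entropy for one query (an assumed form of RE).

    committee_scores[m][d] is the score that committee member m (for
    example, one saved checkpoint) gives to document d of the query.
    """
    n_members = len(committee_scores)
    n_docs = len(committee_scores[0])
    total = 0.0
    for i, j in combinations(range(n_docs), 2):
        # Fraction of members ranking document i strictly above document j;
        # a tie counts as a vote for j.
        p = sum(s[i] > s[j] for s in committee_scores) / n_members
        for q in (p, 1.0 - p):
            if q > 0.0:
                total -= q * math.log(q)
    return total


def prediction_variance(scores):
    """Population variance of one model's scores over a query's documents."""
    return pvariance(scores)


def select_queries(pool, budget, weight=0.5):
    """Pick `budget` query ids by a weighted sum of scaled RE and PV.

    pool maps a query id to its committee_scores. The weighted sum is a
    hypothetical combination rule chosen for illustration.
    """
    ids = list(pool)
    re_values = [ranking_entropy(pool[q]) for q in ids]
    # Assumption: PV uses the most recent checkpoint (the last member).
    pv_values = [prediction_variance(pool[q][-1]) for q in ids]

    def scale(values):
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

    combined = [
        weight * r + (1.0 - weight) * v
        for r, v in zip(scale(re_values), scale(pv_values))
    ]
    ranked = sorted(zip(combined, ids), key=lambda t: t[0], reverse=True)
    return [q for _, q in ranked[:budget]]
```

Calling `select_queries(pool, budget)` with a dictionary of per-query committee scores returns the query ids to send for annotation. Setting `weight` to 1.0 or 0.0 reduces the rule to RE-only or PV-only selection, which makes it easy to compare which queries each criterion favors.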
