Neural ranking models (NRMs) have proven effective in several information retrieval (IR) tasks. However, training NRMs often requires large-scale training data, which is difficult and expensive to obtain. To address this issue, one can train NRMs via weak supervision, where a large training dataset is automatically generated using an existing ranking model (called the weak labeler). Weakly supervised NRMs can generalize beyond the observed data and significantly outperform the weak labeler. This paper generalizes this idea through an iterative re-labeling process, demonstrating that weakly supervised models can iteratively play the role of the weak labeler and significantly improve ranking performance without using manually labeled data. The proposed Generalized Weak Supervision (GWS) solution is generic and orthogonal to the ranking model architecture. This paper offers four implementations of GWS: self-labeling, cross-labeling, joint cross- and self-labeling, and greedy multi-labeling. GWS also benefits from a query importance weighting mechanism, based on query performance prediction methods, that reduces noise in the generated training data. We further draw a theoretical connection between self-labeling and Expectation-Maximization. Our experiments on two passage retrieval benchmarks suggest that all four implementations of GWS lead to substantial improvements over standard weak supervision in all cases.
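To make the iterative re-labeling idea concrete, the sketch below outlines the self-labeling loop in a minimal, hypothetical form: an initial weak labeler (e.g., an unsupervised ranker such as BM25) produces pseudo-labels, an NRM is trained on them, and the trained NRM then replaces the labeler in the next iteration. The helpers `label_pairs` and `train_ranker` are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch of GWS-style iterative re-labeling (self-labeling variant).
# Assumes hypothetical helpers:
#   label_pairs(labeler, queries, documents) -> pseudo-labeled training data
#   train_ranker(training_data)              -> a trained neural ranking model

def generalized_weak_supervision(queries, documents, weak_labeler, n_iterations=3):
    """Iteratively re-label the training data: at each step, the most recently
    trained ranker becomes the weak labeler for the next step."""
    labeler = weak_labeler          # iteration 0: an existing ranker, e.g., BM25
    ranker = None
    for _ in range(n_iterations):
        # 1. Generate pseudo-labels by scoring query-document pairs with the current labeler.
        training_data = label_pairs(labeler, queries, documents)
        # 2. Train a fresh neural ranking model on the pseudo-labeled data.
        ranker = train_ranker(training_data)
        # 3. The trained model plays the role of the weak labeler in the next iteration.
        labeler = ranker
    return ranker
```

The cross-labeling and multi-labeling variants described in the paper differ in which model(s) produce the labels at each step (e.g., models labeling each other's data), but follow the same iterative structure.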