A ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline, yet its training still lags behind: it learns from moderate negatives and/or serves as an auxiliary module for a retriever. In this work, we first identify two major barriers to a robust ranker, i.e., inherent label noise caused by a well-trained retriever and non-ideal negatives sampled for a highly capable ranker. We therefore propose using multiple retrievers as negative generators to improve the ranker's robustness, where i) involving extensive out-of-distribution label noise renders the ranker robust against each noise distribution, and ii) diverse hard negatives drawn from a joint distribution are relatively close to the ranker's own negative distribution, leading to more challenging and thus more effective training. To evaluate our robust ranker (dubbed R$^2$anker), we conduct experiments in various settings on the popular passage retrieval benchmark, including BM25 reranking, full ranking, retriever distillation, etc. The empirical results verify that our model achieves new state-of-the-art effectiveness.
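As a rough illustration of the multi-retriever negative generation described above, the sketch below pools top-k candidates from several retrievers, filters out labeled positives, and samples hard negatives from the joint candidate pool for ranker training. This is a minimal sketch, not the paper's implementation; all names (`sample_multi_retriever_negatives`, `retriever_topk`, `num_negatives`) are hypothetical.

```python
# Minimal sketch (assumed, not the authors' code) of drawing hard negatives
# for ranker training from the joint candidate distribution of several retrievers.
import random
from typing import Dict, List


def sample_multi_retriever_negatives(
    retriever_topk: Dict[str, List[str]],  # retriever name -> top-k passage ids
    positives: List[str],                  # labeled positive passage ids
    num_negatives: int = 8,
) -> List[str]:
    """Pool candidates from every retriever, drop known positives,
    and sample hard negatives from the joint pool."""
    pool = set()
    for ranked_ids in retriever_topk.values():
        pool.update(ranked_ids)
    pool.difference_update(positives)      # avoid trivially false negatives
    candidates = sorted(pool)              # deterministic order before sampling
    random.shuffle(candidates)
    return candidates[:num_negatives]


# Toy usage with candidates from a sparse and a dense retriever.
negatives = sample_multi_retriever_negatives(
    retriever_topk={
        "bm25": ["p3", "p7", "p9"],
        "dense": ["p7", "p2", "p5"],
    },
    positives=["p3"],
    num_negatives=4,
)
print(negatives)
```

Pooling candidates from heterogeneous retrievers is what yields both the out-of-distribution label noise and the diverse hard negatives that the abstract argues make the ranker more robust.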