Current dense text retrieval models face two typical challenges. First, they adopt a siamese dual-encoder architecture to encode queries and documents independently for fast indexing and searching, while neglecting finer-grained term-wise interactions. This results in sub-optimal recall performance. Second, their training relies heavily on a negative-sampling technique to construct the negative documents in their contrastive losses. To address these challenges, we present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker. The two models are jointly optimized according to a minimax adversarial objective: the retriever learns to retrieve negative documents to cheat the ranker, while the ranker learns to rank a collection of candidates including both the ground-truth and the retrieved ones, as well as to provide progressive direct feedback to the dual-encoder retriever. Through this adversarial game, the retriever gradually produces harder negative documents to train a better ranker, whereas the cross-encoder ranker provides progressive feedback to improve the retriever. We evaluate AR2 on three benchmarks. Experimental results show that AR2 consistently and significantly outperforms existing dense retrieval methods and achieves new state-of-the-art results on all of them, including improvements on Natural Questions R@5 to 77.9% (+2.1%), TriviaQA R@5 to 78.2% (+1.4%), and MS-MARCO MRR@10 to 39.5% (+1.3%). Code and models are available at https://github.com/microsoft/AR2.
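As a rough illustration of the minimax objective described above, the sketch below shows one adversarial training step in PyTorch: the ranker is trained with a contrastive loss over the ground-truth document and retrieved negatives, while the retriever is pushed toward the ranker's score distribution over the same candidates. This is a minimal sketch under assumed tensor shapes; the function names, shapes, and the KL-based retriever loss are illustrative simplifications, not the released AR2 implementation.

```python
import torch
import torch.nn.functional as F

def ranker_step(ranker_scores: torch.Tensor) -> torch.Tensor:
    """Ranker loss: contrastive cross-entropy over candidates.

    ranker_scores: (batch, 1 + n_neg) cross-encoder logits, where
    column 0 is the ground-truth document and the rest are the
    negatives retrieved by the dual-encoder.
    """
    targets = torch.zeros(
        ranker_scores.size(0), dtype=torch.long, device=ranker_scores.device
    )
    return F.cross_entropy(ranker_scores, targets)

def retriever_step(
    retriever_scores: torch.Tensor, ranker_scores: torch.Tensor
) -> torch.Tensor:
    """Retriever loss: match the ranker's distribution over candidates.

    Minimizing this KL term concentrates the retriever's probability
    mass on documents the ranker still scores highly, i.e. it learns
    to surface harder negatives. Gradients flow only through
    retriever_scores; the ranker's scores are treated as fixed targets.
    """
    ranker_probs = F.softmax(ranker_scores.detach(), dim=-1)
    retriever_logp = F.log_softmax(retriever_scores, dim=-1)
    return F.kl_div(retriever_logp, ranker_probs, reduction="batchmean")
```

The stop-gradient on the ranker's scores in `retriever_step` reflects the alternating nature of the adversarial game: each model is updated against a frozen snapshot of its opponent, with the two losses optimized in turn rather than through a single backward pass.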