Pre-trained deep language models~(LM) have advanced the state-of-the-art of text retrieval. Rerankers fine-tuned from deep LMs estimate candidate relevance based on rich contextualized matching signals. Meanwhile, deep LMs can also be leveraged to improve the search index, building retrievers with better recall. One would expect a straightforward combination of both in a pipeline to yield additive performance gains. In this paper, we discover otherwise: popular rerankers cannot fully exploit the improved retrieval results. We therefore propose Localized Contrastive Estimation (LCE) for training rerankers and demonstrate that it significantly improves deep two-stage models.
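To make the training objective concrete, below is a minimal sketch of a contrastive loss of the kind LCE describes: for each query, the reranker scores one relevant passage together with hard negatives drawn from the same retriever's top results (hence "localized"), and the loss is a softmax cross-entropy against the positive. The function name `lce_loss` and the convention of placing the positive in column 0 are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def lce_loss(scores):
    """Contrastive loss sketch in the spirit of LCE (assumed layout).

    `scores` has shape (num_queries, 1 + num_negatives): column 0 holds the
    reranker score of the relevant passage; the remaining columns hold scores
    of hard negatives sampled from the same retriever's top candidates.
    Returns the mean negative log-likelihood of the positive per query.
    """
    scores = np.asarray(scores, dtype=float)
    # Numerically stable log-softmax over each query's candidate group.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-probability of the positive (index 0), averaged over queries.
    return float(-log_probs[:, 0].mean())
```

With uniform scores over 1 positive and 3 negatives the loss is log(4); as the positive's score grows relative to the negatives, the loss approaches 0.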