Multi-vector retrieval models such as ColBERT [Khattab and Zaharia, 2020] allow token-level interactions between queries and documents, and hence achieve state-of-the-art results on many information retrieval benchmarks. However, their non-linear scoring function does not scale to millions of documents, necessitating a three-stage inference process: retrieving initial candidates via token retrieval, accessing all token vectors, and scoring the initial candidate documents. The non-linear scoring function is applied over all token vectors of each candidate document, making the inference process complicated and slow. In this paper, we aim to simplify multi-vector retrieval by rethinking the role of token retrieval. We present XTR, ConteXtualized Token Retriever, which introduces a simple yet novel objective function that encourages the model to retrieve the most important document tokens first. The improvement to token retrieval allows XTR to rank candidates using only the retrieved tokens rather than all tokens in the document, and enables a newly designed scoring stage that is two to three orders of magnitude cheaper than that of ColBERT. On the popular BEIR benchmark, XTR advances the state of the art by 2.8 nDCG@10 without any distillation. Detailed analysis supports our decision to revisit the token retrieval stage, as XTR achieves much better recall in the token retrieval stage than ColBERT.
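To make the contrast between the two scoring stages concrete, the following is a minimal NumPy sketch, not the authors' implementation. `sum_of_max_score` applies the ColBERT-style non-linear score over all token vectors of one document, while `retrieved_token_scores` ranks candidates using only the tokens returned by token retrieval. The function names, the brute-force search standing in for an approximate nearest-neighbor token index, and the way missing similarities are filled in (with the lowest retrieved score for that query token) are all assumptions of this sketch; the abstract does not specify them.

```python
import numpy as np


def sum_of_max_score(Q, D):
    """ColBERT-style score for one document.

    Q: (n_query_tokens, dim) query token vectors.
    D: (n_doc_tokens, dim) ALL token vectors of the document.
    """
    sim = Q @ D.T  # token-level similarity matrix
    # For each query token take its best-matching document token, then sum.
    return float(sim.max(axis=1).sum())


def retrieved_token_scores(Q, index_vecs, index_doc_ids, k):
    """Score candidates using only the top-k retrieved tokens per query token.

    index_vecs: (n_tokens, dim) token vectors of the whole collection.
    index_doc_ids: (n_tokens,) id of the document each token belongs to.
    Returns a dict mapping candidate document id -> score.
    """
    # Brute-force search; a hypothetical stand-in for a real token index.
    sim = Q @ index_vecs.T
    top = np.argsort(-sim, axis=1)[:, :k]  # (n_query_tokens, k)

    # Candidates are the documents that own at least one retrieved token.
    candidates = np.unique(index_doc_ids[top])
    column = {int(d): c for c, d in enumerate(candidates)}

    # per_token[i, c]: best retrieved similarity of query token i to candidate c.
    per_token = np.full((Q.shape[0], candidates.size), -np.inf)
    for i in range(Q.shape[0]):
        for j in top[i]:
            c = column[int(index_doc_ids[j])]
            per_token[i, c] = max(per_token[i, c], sim[i, j])
        # Assumption of this sketch: when no token of a candidate was
        # retrieved for query token i, use the lowest retrieved score of
        # that query token in its place.
        floor = sim[i, top[i, -1]]
        per_token[i, np.isinf(per_token[i])] = floor

    totals = per_token.sum(axis=0)
    return {int(d): float(s) for d, s in zip(candidates, totals)}
```

When `k` is large enough that every token is retrieved, the second function reproduces the sum-of-max score for every document. For small `k` it never loads the remaining token vectors of a candidate, which is where the cheaper scoring stage described above comes from.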