Recent work in information retrieval finds that representing queries and documents with multiple vectors yields bi-encoder retrievers that are robust on out-of-distribution datasets. In this paper, we explore whether late interaction, the simplest form of multi-vector representation, also helps neural rerankers, which conventionally use only the [CLS] vector to compute the similarity score. Although the reranker's attention mechanism intuitively already gathers token-level information in earlier layers, we find that adding late interaction still brings an extra 5% improvement on average on out-of-distribution datasets, with little increase in latency and no degradation in in-domain effectiveness. Through extensive experiments and analysis, we show that this finding is consistent across model sizes and across first-stage retrievers of diverse natures, and that the improvement is more prominent on longer queries.
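To make the contrast between [CLS]-only scoring and late interaction concrete, here is a minimal sketch in plain Python. It assumes the token vectors have already been produced by the reranker; the function names (`cls_score`, `late_interaction_score`, `combined_score`) are hypothetical. Late interaction is written as the ColBERT-style MaxSim operator, and combining the two scores by a plain sum is an assumption made for illustration, since the abstract does not specify how the scores are merged.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cls_score(cls_vec, weight, bias=0.0):
    """Conventional reranker head: a linear layer over the [CLS] vector only."""
    return dot(cls_vec, weight) + bias


def late_interaction_score(query_vecs, doc_vecs):
    """MaxSim: each query token takes its best cosine match among the
    document tokens, and the per-token maxima are summed.
    Both inputs are assumed to be non-empty lists of equal-length vectors."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


def combined_score(cls_vec, weight, query_vecs, doc_vecs):
    # Assumption: the two signals are simply added; the actual combination
    # used in the paper may differ.
    return cls_score(cls_vec, weight) + late_interaction_score(query_vecs, doc_vecs)


# Toy example with 2-dimensional token vectors.
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[1.0, 0.0], [1.0, 1.0]]
li = late_interaction_score(query_vecs, doc_vecs)
total = combined_score([1.0, 2.0], [0.5, 0.25], query_vecs, doc_vecs)
```

Because MaxSim sums one term per query token, its contribution grows with query length, which is one plausible reading of why the reported gains are larger on longer queries.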