Pseudo Relevance Feedback (PRF) is known to improve the effectiveness of bag-of-words retrievers. At the same time, deep language models have been shown to outperform traditional bag-of-words rerankers. However, it is unclear how to integrate PRF directly with emergent deep language models. In this article, we address this gap by investigating methods for integrating PRF signals into rerankers and dense retrievers based on deep language models. We consider text-based and vector-based PRF approaches, and investigate different ways of combining and scoring relevance signals. We conducted an extensive empirical evaluation across four datasets and two task settings (retrieval and ranking). The text-based PRF results show that PRF had a mixed effect on deep rerankers across datasets. The best effectiveness was achieved when (i) directly concatenating each PRF passage with the query, searching with the new set of queries, and then aggregating the scores; and (ii) using Borda to aggregate scores from the PRF runs. The vector-based PRF results show that PRF enhanced the effectiveness of deep rerankers and dense retrievers on several evaluation metrics. Higher effectiveness was achieved when (i) the query retained either the majority of the weight or the same weight as the feedback signal within the PRF mechanism, and (ii) a shallower PRF signal (i.e., a smaller number of top-ranked passages) was employed, rather than a deeper one. Our vector-based PRF method is computationally efficient; it thus offers a general PRF method that others can use with deep rerankers and dense retrievers.
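The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so everything below is an assumption for illustration: the vector-based update is written as a Rocchio-style weighted sum of the query vector and the mean of the top-k passage vectors, the Borda fusion uses plain positional points, and the names `vector_prf`, `borda_aggregate`, `alpha`, `beta`, and `k` are hypothetical rather than taken from the article.

```python
from collections import defaultdict


def vector_prf(query_vec, passage_vecs, k=3, alpha=0.5, beta=0.5):
    """Assumed Rocchio-style update: alpha * query + beta * mean(top-k passages).

    passage_vecs is assumed to be sorted by first-stage rank, best first.
    alpha >= beta mirrors the finding that the query should keep the
    majority of the weight or the same weight; a small k is a shallow signal.
    """
    top = passage_vecs[:k]
    if not top:
        # No feedback passages available: fall back to the original query.
        return list(query_vec)
    dim = len(query_vec)
    centroid = [sum(p[i] for p in top) / len(top) for i in range(dim)]
    return [alpha * q + beta * c for q, c in zip(query_vec, centroid)]


def borda_aggregate(rankings):
    """Fuse several ranked lists of passage ids using Borda points.

    In a list of length n, the top passage earns n points and the last earns 1.
    Passages missing from a list earn nothing from that list.
    """
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for rank, pid in enumerate(ranking):
            points[pid] += n - rank
    # Highest total first; ties are broken by passage id for determinism.
    return sorted(points, key=lambda pid: (-points[pid], pid))


if __name__ == "__main__":
    new_query = vector_prf([1.0, 0.0], [[0.0, 1.0], [0.0, 3.0]], k=2)
    print(new_query)
    print(borda_aggregate([["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]))
```

In the text-based setting, each PRF passage concatenated with the query would produce one ranked list, and a function like `borda_aggregate` would fuse those lists. In the vector-based setting, the output of `vector_prf` would be used as the new query embedding for a second retrieval pass.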