Current pre-trained language model approaches to information retrieval can be broadly divided into two categories: sparse retrievers (a category that also includes non-neural bag-of-words methods such as BM25) and dense retrievers. Each category appears to capture different characteristics of relevance. Previous work has investigated combining the relevance signals from sparse retrievers with those from dense retrievers via interpolation, which generally leads to higher retrieval effectiveness. In this paper we consider the problem of combining the relevance signals from sparse and dense retrievers in the context of Pseudo Relevance Feedback (PRF). This context poses two key challenges: (1) When should interpolation occur: before, after, or both before and after the PRF process? (2) Which sparse representation should be used: a zero-shot bag-of-words model (BM25), or a learnt sparse representation? To answer these questions we perform a thorough empirical evaluation considering an effective and scalable neural PRF approach (Vector-PRF), three effective dense retrievers (ANCE, TCTv2, DistilBERT), and one state-of-the-art learnt sparse retriever (uniCOIL). The empirical findings from our experiments suggest that, regardless of the sparse representation and dense retriever used, interpolating both before and after PRF achieves the highest effectiveness across most datasets and metrics.
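To make the pipeline concrete, the following is a minimal sketch of score interpolation around a Vector-PRF step. It assumes linear interpolation of min-max normalised scores and the average variant of Vector-PRF (new query embedding = mean of the query and top-k feedback document embeddings); all function names, the toy embeddings, and the stand-in sparse scores are illustrative, not the paper's actual implementation.

```python
import numpy as np

def minmax(scores):
    """Min-max normalise scores so dense and sparse scales are comparable."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

def interpolate(dense_scores, sparse_scores, alpha=0.5):
    """Linear interpolation of normalised dense and sparse relevance scores."""
    return alpha * minmax(dense_scores) + (1 - alpha) * minmax(sparse_scores)

def vector_prf_average(query_vec, feedback_vecs):
    """Average variant of Vector-PRF: the refined query embedding is the mean
    of the original query embedding and the feedback document embeddings."""
    return np.vstack([query_vec[None, :], feedback_vecs]).mean(axis=0)

# Toy example: 2-d embeddings, 4 candidate documents.
rng = np.random.default_rng(0)
doc_embs = rng.normal(size=(4, 2))
q = rng.normal(size=2)

dense = doc_embs @ q                      # dense scores (dot product)
sparse = np.array([1.2, 0.3, 2.5, 0.9])  # stand-in for BM25/uniCOIL scores

fused = interpolate(dense, sparse)        # interpolation BEFORE PRF
top_k = doc_embs[np.argsort(-fused)[:2]]  # top-2 pseudo-relevant documents
q_new = vector_prf_average(q, top_k)      # PRF-refined query embedding

dense_prf = doc_embs @ q_new
final = interpolate(dense_prf, sparse)    # interpolation AFTER PRF
```

The "before, after, or both" question from the abstract corresponds to which of the two `interpolate` calls are kept in this pipeline.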