We study multi-answer retrieval, an under-explored problem that requires retrieving passages that together cover multiple distinct answers to a given question. This task requires joint modeling of the retrieved passages, because a model should not repeatedly retrieve passages containing the same answer at the cost of missing a different valid answer. Prior work, which focuses on single-answer retrieval, is limited because it cannot reason about the set of passages jointly. In this paper, we introduce JPR, a joint passage retrieval model focusing on reranking. To model the joint probability of the retrieved passages, JPR uses an autoregressive reranker that selects a sequence of passages, equipped with novel training and decoding algorithms. Compared to prior approaches, JPR achieves significantly better answer coverage on three multi-answer datasets. When combined with downstream question answering, the improved retrieval makes larger answer generation models practical, since they need to consider fewer passages, establishing a new state of the art.
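To make the contrast between independent and joint selection concrete, the following is a minimal toy sketch, not JPR itself. It compares top-k ranking by an independent relevance score with a greedy sequential selection whose per-step score depends on the passages already chosen, which is the general shape of an autoregressive reranker. All names (`independent_top_k`, `sequential_select`, `answer_coverage`, `toy_step_score`) and the data are hypothetical. The toy step score reads the answers contained in each passage directly; a real reranker has no such oracle at inference time and would have to learn this conditioning from data.

```python
from typing import Callable, List, Sequence, Set


def independent_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Rank passages by an independent relevance score and keep the top k."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def sequential_select(
    num_passages: int,
    k: int,
    step_score: Callable[[List[int], int], float],
) -> List[int]:
    """Greedily pick k passages, scoring each candidate given the prefix so far."""
    selected: List[int] = []
    for _ in range(min(k, num_passages)):
        remaining = [i for i in range(num_passages) if i not in selected]
        best = max(remaining, key=lambda i: step_score(selected, i))
        selected.append(best)
    return selected


def answer_coverage(
    selected: Sequence[int],
    passage_answers: Sequence[Set[str]],
    gold: Set[str],
) -> float:
    """Fraction of gold answers found in at least one selected passage."""
    found: Set[str] = set().union(*(passage_answers[i] for i in selected))
    return len(found & gold) / len(gold)


# Toy data (assumed): passages 0 and 1 both contain answer "A".
passage_answers = [{"A"}, {"A"}, {"B"}, {"C"}, set()]
scores = [0.9, 0.8, 0.7, 0.6, 0.1]
gold = {"A", "B", "C"}


def toy_step_score(selected: List[int], candidate: int) -> float:
    """Assumed stand-in for a learned conditional score: relevance plus a
    bonus when the candidate adds an answer the prefix does not cover."""
    covered: Set[str] = set().union(*(passage_answers[i] for i in selected))
    novelty = 1.0 if passage_answers[candidate] - covered else 0.0
    return scores[candidate] + novelty


independent = independent_top_k(scores, k=3)
joint = sequential_select(len(scores), k=3, step_score=toy_step_score)

print("independent:", independent, answer_coverage(independent, passage_answers, gold))
print("sequential: ", joint, answer_coverage(joint, passage_answers, gold))
```

On this toy input, independent ranking spends two of its three slots on passages that both contain answer "A" and covers two of the three answers, while the sequential variant skips the redundant passage and covers all three.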