Improving Passage Retrieval with Zero-Shot Question Generation

We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of any retrieval method (e.g. neural or keyword-based), does not require any domain- or task-specific training (and therefore is expected to generalize better to data distribution shifts), and provides rich cross-attention between query and passage (i.e. it must explain every token in the question). When evaluated on a number of open-domain retrieval datasets, our re-ranker improves strong unsupervised retrieval models by 6%-18% absolute and strong supervised models by up to 12% in terms of top-20 passage retrieval accuracy. We also obtain new state-of-the-art results on full open-domain question answering by simply adding the new re-ranker to existing models with no further changes.
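The scoring idea in the abstract, ranking each retrieved passage by how likely the question is given that passage, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `toy_question_log_likelihood` is a hypothetical stand-in that uses a smoothed unigram model built from the passage, not the pre-trained language model the abstract refers to, and averaging the log-probability over question tokens is a design choice of this sketch rather than something the abstract states. In practice the scorer would be replaced by a pre-trained language model's token-level log-likelihood of the question.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def toy_question_log_likelihood(question: str, passage: str, alpha: float = 0.1) -> float:
    """Hypothetical stand-in for a pre-trained LM.

    Returns the mean log p(question token | passage) under an add-alpha
    smoothed unigram model estimated from the passage alone. Every question
    token contributes, so a passage is penalized for each token it cannot
    "explain".
    """
    q_tokens = tokenize(question)
    p_tokens = tokenize(passage)
    counts = Counter(p_tokens)
    vocab = set(p_tokens) | set(q_tokens)
    denom = len(p_tokens) + alpha * len(vocab)
    total = sum(math.log((counts[tok] + alpha) / denom) for tok in q_tokens)
    return total / max(len(q_tokens), 1)


def rerank(
    question: str,
    passages: Sequence[str],
    scorer: Callable[[str, str], float] = toy_question_log_likelihood,
) -> List[str]:
    """Re-order passages from any first-stage retriever, best score first."""
    return sorted(passages, key=lambda p: scorer(question, p), reverse=True)


if __name__ == "__main__":
    question = "who wrote hamlet"
    retrieved = [
        "paris is the capital of france",
        "hamlet is a tragedy that william shakespeare wrote",
    ]
    for passage in rerank(question, retrieved):
        print(passage)
```

Because `rerank` only needs a list of passages and a scoring function, it is agnostic to whether the candidates came from a neural or a keyword-based retriever, which mirrors the claim that the re-ranker can sit on top of any retrieval method.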

