Improving Passage Retrieval with Zero-Shot Question Generation

We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of any retrieval method (e.g. neural or keyword-based), does not require any domain- or task-specific training (and therefore is expected to generalize better to data distribution shifts), and provides rich cross-attention between query and passage (i.e. it must explain every token in the question). When evaluated on a number of open-domain retrieval datasets, our re-ranker improves strong unsupervised retrieval models by 6%-18% absolute and strong supervised models by up to 12% in terms of top-20 passage retrieval accuracy. We also obtain new state-of-the-art results on full open-domain question answering by simply adding the new re-ranker to existing models with no further changes.
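The re-scoring step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the pre-trained language model is replaced by a toy add-one-smoothed unigram model (`toy_token_log_prob`) so the example runs without any downloads, and all names (`TokenLogProb`, `question_log_likelihood`, `rerank`, `vocab_size`) are hypothetical. The length normalisation is also an assumption; it does not change the ranking, because every passage is scored against the same question.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical interface: log p(token | previous question tokens, passage).
# In the setting of the abstract this would be a pre-trained language model;
# here it is only a placeholder signature.
TokenLogProb = Callable[[str, Sequence[str], Sequence[str]], float]


def question_log_likelihood(
    question: Sequence[str],
    passage: Sequence[str],
    token_log_prob: TokenLogProb,
) -> float:
    """Average per-token log-probability of the question given the passage."""
    total = 0.0
    for t, token in enumerate(question):
        # Every question token contributes, so the passage must "explain"
        # the whole question to score well.
        total += token_log_prob(token, question[:t], passage)
    return total / max(len(question), 1)


def rerank(
    question: Sequence[str],
    passages: List[Sequence[str]],
    token_log_prob: TokenLogProb,
) -> List[int]:
    """Return passage indices sorted by descending question likelihood."""
    scores = [
        question_log_likelihood(question, p, token_log_prob) for p in passages
    ]
    # sorted() is stable, so ties keep the first-stage retriever's order.
    return sorted(range(len(passages)), key=lambda i: -scores[i])


def toy_token_log_prob(
    token: str,
    prefix: Sequence[str],
    passage: Sequence[str],
    vocab_size: int = 1000,  # assumed vocabulary size for smoothing
) -> float:
    """Stand-in for a language model: smoothed unigram over the passage.

    It ignores `prefix`, which a real autoregressive model would not.
    """
    counts = Counter(passage)
    return math.log((counts[token] + 1) / (len(passage) + vocab_size))


question = "who wrote hamlet".split()
passages = [
    "paris is the capital of france".split(),
    "hamlet is a tragedy that william shakespeare wrote".split(),
    "the hamlet lies near the river".split(),
]
order = rerank(question, passages, toy_token_log_prob)
print(order)  # [1, 2, 0]
```

Because `rerank` only needs a list of candidate passages, it can sit on top of any first-stage retriever, keyword-based or neural, which is the property the abstract emphasises. Swapping the toy scorer for a real pre-trained model only requires another function with the same `TokenLogProb` signature.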

