FiE: Building a Global Probability Space by Leveraging Early Fusion in Encoder for Open-Domain Question Answering

Generative models have recently started to outperform extractive models in open-domain question answering, largely by using their decoder to attend over multiple encoded passages and combine their information. However, generative models tend to be larger than extractive models because they need a decoder, run slower at inference time because of auto-regressive beam search in the decoder, and often produce hallucinated output. We propose to extend transformer encoders with the ability to fuse information from multiple passages, using a global representation to provide cross-sample attention over all tokens across samples. Furthermore, we propose an alternative answer span probability calculation to better aggregate answer scores in the global space of all samples. Using our proposed method, we outperform the current state-of-the-art method by $2.5$ Exact Match points on the Natural Questions dataset while using only $25\%$ of the parameters and $35\%$ of the inference latency, and by $4.4$ Exact Match points on the WebQuestions dataset. When coupled with synthetic data augmentation, we outperform larger models on the TriviaQA dataset as well. The latency and parameter savings of our method make it particularly attractive for open-domain question answering, where models are often compute-intensive.
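The abstract does not give the span probability formula, so the following is only an illustrative sketch of what scoring answers in a global space over all passages could look like. It assumes that a span's score is the sum of its start and end logits, that one softmax is taken over the spans of every passage together (instead of one softmax per passage), and that probabilities of identical answer strings are summed. The function name `global_span_probabilities`, the input layout, and `max_span_len` are hypothetical and not taken from the paper.

```python
import math
from collections import defaultdict


def global_span_probabilities(passages, max_span_len=3):
    """Score answer strings with one softmax over spans from all passages.

    Hypothetical input layout: each passage is a dict with "tokens",
    "start_logits" and "end_logits" lists of equal length.
    """
    candidates = []  # (answer_text, span_score) pairs from every passage
    for passage in passages:
        tokens = passage["tokens"]
        start = passage["start_logits"]
        end = passage["end_logits"]
        for i in range(len(tokens)):
            for j in range(i, min(i + max_span_len, len(tokens))):
                # Assumed span score: start logit plus end logit.
                candidates.append((" ".join(tokens[i:j + 1]), start[i] + end[j]))

    # One numerically stable softmax over all spans of all passages.
    max_score = max(score for _, score in candidates)
    normalizer = sum(math.exp(score - max_score) for _, score in candidates)

    # Sum the probability mass of spans that share the same answer text.
    probabilities = defaultdict(float)
    for text, score in candidates:
        probabilities[text] += math.exp(score - max_score) / normalizer
    return dict(probabilities)


passages = [
    {"tokens": ["paris", "is", "nice"],
     "start_logits": [2.0, 0.0, 0.0],
     "end_logits": [2.0, 0.0, 0.0]},
    {"tokens": ["in", "paris"],
     "start_logits": [0.0, 1.5],
     "end_logits": [0.0, 1.5]},
]
probs = global_span_probabilities(passages)
best_answer = max(probs, key=probs.get)
```

In this toy input the string "paris" occurs in both passages, so under these assumptions its two span probabilities are added together before answers are ranked.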

