Generative models have recently started to outperform extractive models in Open-Domain Question Answering, largely by leveraging their decoder to attend over multiple encoded passages and combine their information. However, generative models tend to be larger than extractive models due to the need for a decoder, run slower during inference due to auto-regressive beam search in the decoder, and often suffer from hallucinations in their generated output. We propose to extend transformer encoders with the ability to fuse information from multiple passages, using global representations to provide cross-sample attention over all tokens across samples. Furthermore, we propose an alternative answer span probability calculation that better aggregates answer scores in the global space of all samples. Using our proposed method, we outperform the current state-of-the-art method by $2.5$ Exact Match on the Natural Questions dataset, while using only $25\%$ of the parameters and $35\%$ of the inference latency, and by $4.4$ Exact Match on the WebQuestions dataset. When coupled with synthetic data augmentation, we outperform larger models on the TriviaQA dataset as well. The latency and parameter savings of our method make it particularly attractive for open-domain question answering, as these models are often compute-intensive.
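To make the architectural idea concrete, the following PyTorch sketch shows one way an encoder layer could fuse information across passages via global representations: a small set of learnable global tokens attends over all tokens of all passages retrieved for one question, and each passage token then attends back to those global tokens. Every detail here (module names, the number of global tokens, the residual wiring) is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CrossSampleFusionLayer(nn.Module):
    """Encoder layer augmented with global tokens that attend over all
    tokens of all passages retrieved for one question (a sketch only)."""

    def __init__(self, d_model: int = 768, n_heads: int = 12, n_global: int = 8):
        super().__init__()
        self.local_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.global_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.fuse_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.global_tokens = nn.Parameter(torch.randn(n_global, d_model))

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (num_passages, seq_len, d_model) for a single question.
        n, seq_len, d = hidden.shape
        # 1) Ordinary self-attention within each passage.
        local, _ = self.local_attn(hidden, hidden, hidden)
        # 2) Global tokens read from ALL tokens across ALL passages,
        #    building a cross-sample summary of the evidence.
        all_tokens = local.reshape(1, n * seq_len, d)
        g = self.global_tokens.unsqueeze(0)            # (1, n_global, d)
        g, _ = self.global_attn(g, all_tokens, all_tokens)
        # 3) Every passage token reads back from the global tokens,
        #    fusing information from the other passages.
        g = g.expand(n, -1, -1)                        # (n, n_global, d)
        fused, _ = self.fuse_attn(local, g, g)
        return local + fused                           # residual connection

# Usage: 10 retrieved passages of 128 tokens each for one question.
layer = CrossSampleFusionLayer()
out = layer(torch.randn(10, 128, 768))                 # (10, 128, 768)
```

This keeps the model encoder-only: cross-passage fusion happens inside the encoder rather than in an auto-regressive decoder, which is what enables the latency savings claimed above.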
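The global answer span probability can likewise be sketched. A conventional extractive reader normalizes start/end logits with a softmax inside each passage, so spans only compete locally; a global variant normalizes the same logits over the concatenation of all passages, so candidate spans from different passages compete in one shared probability space and scores for identical answer strings can be aggregated there. The function names and the product-of-marginals span score below are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def per_passage_span_probs(start_logits: torch.Tensor, end_logits: torch.Tensor):
    # start_logits, end_logits: (num_passages, seq_len).
    # Softmax within each passage: spans only compete locally.
    return F.softmax(start_logits, dim=-1), F.softmax(end_logits, dim=-1)

def global_span_probs(start_logits: torch.Tensor, end_logits: torch.Tensor):
    # Flatten all passages so the softmax runs over every token across
    # samples: spans from different passages compete globally.
    n, seq_len = start_logits.shape
    p_start = F.softmax(start_logits.reshape(-1), dim=-1).reshape(n, seq_len)
    p_end = F.softmax(end_logits.reshape(-1), dim=-1).reshape(n, seq_len)
    return p_start, p_end

# Span score for span (i, j) in passage p, assuming independent marginals:
#   score(p, i, j) = p_start[p, i] * p_end[p, j]
# Identical answer strings found in different passages can then be
# aggregated (e.g., summed) directly, since all scores share one space.
```

Normalizing globally rather than per passage is what lets evidence repeated across passages reinforce a single answer candidate instead of being scored in isolation.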