Systems for knowledge-intensive tasks such as open-domain question answering (QA) usually consist of two stages: efficient retrieval of relevant documents from a large corpus and detailed reading of the selected documents to generate answers. Retrievers and readers are usually modeled separately, which necessitates a cumbersome implementation and is hard to train and adapt in an end-to-end fashion. In this paper, we revisit this design and eschew the separate architecture and training in favor of a single Transformer that performs Retrieval as Attention (ReAtt), and end-to-end training solely based on supervision from the end QA task. We demonstrate for the first time that a single model trained end-to-end can achieve both competitive retrieval and QA performance, matching or slightly outperforming state-of-the-art separately trained retrievers and readers. Moreover, end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings, making our model a simple and adaptable solution for knowledge-intensive tasks. Code and models are available at https://github.com/jzbjyb/ReAtt.
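The core idea of retrieval as attention can be illustrated with a small sketch: instead of a separate dense retriever, document relevance is read off cross-attention weights between query tokens and document tokens inside a single model. This is a hypothetical illustration with random token embeddings and one simple aggregation choice (max over document tokens, mean over query tokens), not ReAtt's actual T5-based implementation.

```python
import numpy as np

# Hypothetical sketch of attention-as-retrieval: random vectors stand in for
# the token representations a single Transformer would produce.
rng = np.random.default_rng(0)
d = 16                                               # embedding dim (assumed)
query = rng.normal(size=(3, d))                      # 3 query tokens
docs = [rng.normal(size=(n, d)) for n in (5, 7, 4)]  # token embeddings per doc

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_relevance(q, doc):
    # Cross-attention logits between query tokens and document tokens.
    logits = q @ doc.T / np.sqrt(d)
    attn = softmax(logits, axis=-1)  # each query token attends over doc tokens
    # Aggregate to one document score: the max attention mass each query token
    # places on any doc token, averaged over query tokens (one simple choice).
    return float(attn.max(axis=-1).mean())

scores = [attention_relevance(query, doc) for doc in docs]
ranking = sorted(range(len(docs)), key=lambda i: -scores[i])
print(ranking)
```

Because the same attention weights drive both retrieval and reading, supervision from the end QA task can shape the retrieval behavior directly, which is what makes end-to-end training and adaptation possible.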