This work presents R2-D2 (Rank twice, reaD twice), a novel four-stage open-domain QA pipeline. The pipeline is composed of a retriever, a passage reranker, an extractive reader, a generative reader, and a mechanism that aggregates the final prediction from all of the system's components. We demonstrate its strength across three open-domain QA datasets: NaturalQuestions, TriviaQA and EfficientQA, surpassing the state of the art on the first two. Our analysis demonstrates that: (i) combining the extractive and generative readers yields absolute improvements of up to 5 exact-match points and is at least twice as effective as a posterior-averaging ensemble of the same models with different parameters, and (ii) the extractive reader, despite having fewer parameters, can match the performance of the generative reader on extractive QA datasets.
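As a rough illustration of how the four stages and the final aggregation step fit together, the following minimal Python sketch wires up toy stand-ins for each component. Everything in it is an assumption made for illustration: the function names (`retrieve`, `rerank`, `aggregate`, `answer`), the token-overlap scoring, and the weighted-sum aggregation with `w_ext` and `w_gen` are hypothetical and are not taken from the R2-D2 system, whose components are neural models.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Scored = Dict[str, float]  # answer string -> score (hypothetical format)
Reader = Callable[[str, List[str]], Scored]


def overlap(question: str, passage: str) -> int:
    """Toy relevance score: number of shared lowercase tokens."""
    return len(set(question.lower().split()) & set(passage.lower().split()))


def retrieve(question: str, corpus: List[str], k: int) -> List[str]:
    # Stage 1 (toy stand-in): keep the k passages with the highest overlap.
    return sorted(corpus, key=lambda p: overlap(question, p), reverse=True)[:k]


def rerank(question: str, passages: List[str], k: int) -> List[str]:
    # Stage 2 (toy stand-in): re-score with length-normalised overlap.
    def score(p: str) -> float:
        return overlap(question, p) / max(len(p.split()), 1)
    return sorted(passages, key=score, reverse=True)[:k]


def aggregate(extractive: Scored, generative: Scored,
              w_ext: float = 0.5, w_gen: float = 0.5) -> str:
    # Assumed aggregation: weighted sum of per-answer scores from both
    # readers. This is an illustrative choice, not the paper's mechanism.
    combined: Dict[str, float] = defaultdict(float)
    for ans, s in extractive.items():
        combined[ans] += w_ext * s
    for ans, s in generative.items():
        combined[ans] += w_gen * s
    return max(combined, key=combined.get)


def answer(question: str, corpus: List[str],
           read_extractive: Reader, read_generative: Reader,
           k_retrieve: int = 3, k_rerank: int = 2) -> str:
    # Rank twice (retrieve, rerank), then read twice (stages 3 and 4).
    passages = rerank(question, retrieve(question, corpus, k_retrieve), k_rerank)
    return aggregate(read_extractive(question, passages),
                     read_generative(question, passages))
```

The two readers are passed in as callables so that any extractive or generative model can be plugged in. In this sketch, an answer that both readers support accumulates score from both, which is one simple way to let the readers complement each other.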