End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages. Recent work has successfully trained neural IR systems using only supervised QA examples from open-domain datasets. However, despite impressive performance on Wikipedia, neural IR lags behind traditional term-matching approaches such as BM25 in more specialized target domains such as COVID-19. Furthermore, given little or no labeled data, effectively adapting QA systems to such target domains is also challenging. In this work, we explore the use of synthetically generated QA examples to improve performance on closed-domain retrieval and MRC. We combine our neural IR and MRC systems and show significant improvements in end-to-end QA on the CORD-19 collection over a state-of-the-art open-domain QA baseline.