In this paper, we introduce MIX, a multi-task deep learning approach to Open-Domain Question Answering. First, we design our system as a multi-stage pipeline made of three building blocks: a BM25-based Retriever, which reduces the search space, and a RoBERTa-based Scorer and Extractor, which respectively rank the retrieved paragraphs and extract relevant spans of text. Second, we improve the computational efficiency of the system to address the scalability challenge: through multi-task learning, we parallelize the closely related tasks solved by the Scorer and the Extractor. Our system performs on par with the state of the art on the SQuAD-open benchmark while being conceptually simpler.
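The three-stage pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation. The BM25 formula uses a common non-negative IDF variant with default parameters `k1=1.5` and `b=0.75`, which are assumptions. The names `BM25Retriever`, `toy_reader`, and `answer` are hypothetical. `toy_reader` is a word-overlap stand-in for the RoBERTa model: it only mimics the multi-task idea of returning a paragraph score and a span from a single pass.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real system would use a proper one).
    return text.lower().split()


class BM25Retriever:
    """Stage 1: shrink the search space with BM25 ranking."""

    def __init__(self, paragraphs, k1=1.5, b=0.75):
        self.paragraphs = paragraphs
        self.docs = [tokenize(p) for p in paragraphs]
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]
        n = len(self.docs)
        df = Counter(term for d in self.docs for term in set(d))
        # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        dl = len(self.docs[i])
        norm = self.k1 * (1 - self.b + self.b * dl / self.avgdl)
        total = 0.0
        for t in tokenize(query):
            f = self.tf[i].get(t, 0)
            if f:
                total += self.idf[t] * f * (self.k1 + 1) / (f + norm)
        return total

    def retrieve(self, query, top_k=2):
        order = sorted(range(len(self.docs)),
                       key=lambda i: self.score(query, i), reverse=True)
        return [self.paragraphs[i] for i in order[:top_k]]


def toy_reader(question, paragraph):
    """Stages 2 and 3 in one pass: returns (paragraph_score, span).

    Toy stand-in for a shared encoder with a Scorer head and an Extractor head.
    """
    q = set(tokenize(question))
    tokens = paragraph.split()
    hits = [i for i, tok in enumerate(tokens) if tok.lower() in q]
    if not hits:
        return 0.0, ""
    score = len({tokens[i].lower() for i in hits}) / len(q)
    span = " ".join(tokens[hits[0]:hits[-1] + 1])
    return score, span


def answer(question, retriever, top_k=2):
    # Read each retrieved paragraph once, then keep the best-scored span.
    results = [toy_reader(question, p) for p in retriever.retrieve(question, top_k)]
    return max(results, key=lambda r: r[0])
```

Because the score and the span come out of the same call, each retrieved paragraph is processed once rather than twice, which is the efficiency gain the multi-task design aims for.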