Transformer-based pre-trained models, such as BERT, have achieved remarkable results on machine reading comprehension. However, due to the constraint on encoding length (e.g., 512 WordPiece tokens), a long document is usually split into multiple chunks that are read independently. This limits the reading field to individual chunks, with no information collaboration across them for long-document machine reading comprehension. To address this problem, we propose RoR, a read-over-read method that expands the reading field from chunk to document. Specifically, RoR consists of a chunk reader and a document reader. The former first predicts a set of regional answers for each chunk, which are then compacted into a highly condensed version of the original document that is guaranteed to be encodable in a single pass. The latter then predicts global answers from this condensed document. Finally, a voting strategy aggregates and reranks the regional and global answers for the final prediction. Extensive experiments on two benchmarks, QuAC and TriviaQA, demonstrate the effectiveness of RoR for long-document reading. Notably, RoR ranked 1st on the QuAC leaderboard (https://quac.ai/) at the time of submission (May 17th, 2021).
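The chunking setup described above (splitting a long document into windows that fit the 512-token encoding limit) can be sketched as follows. This is a minimal illustration assuming whitespace tokens and an overlapping sliding window; the stride value and tokenization are illustrative stand-ins, not the paper's actual WordPiece pipeline.

```python
def split_into_chunks(tokens, max_len=512, stride=256):
    """Split a token sequence into overlapping chunks.

    Each chunk holds at most `max_len` tokens; consecutive chunks
    start `stride` tokens apart, so they overlap by `max_len - stride`
    tokens and an answer spanning a chunk boundary is still fully
    contained in some chunk.
    """
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the final chunk already covers the document tail
        start += stride
    return chunks


# Example: a 1000-token "document" yields three overlapping chunks,
# each readable by an encoder limited to 512 tokens.
doc = [f"tok{i}" for i in range(1000)]
chunks = split_into_chunks(doc)
```

Each chunk would then be read independently by the chunk reader, which is exactly the per-chunk isolation that RoR's document reader is designed to overcome.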