Knowledge-intensive tasks such as question answering often require assimilating information from different sections of large inputs such as books or article collections. We propose ReadTwice, a simple and effective technique that combines several strengths of prior approaches to model long-range dependencies with Transformers. The main idea is to read text in small segments, in parallel, summarizing each segment into a memory table to be used in a second read of the text. We show that the method outperforms models of comparable size on several question answering (QA) datasets and sets a new state of the art on the challenging NarrativeQA task, with questions about entire books. Source code and pre-trained checkpoints for ReadTwice can be found at https://goo.gle/research-readtwice.
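To make the two-pass reading idea concrete, the following is a minimal sketch of the flow described above: segments are first encoded independently, each is summarized into one row of a memory table, and a second read lets every segment attend over that table for global context. All function names and the toy mean-pooling "encoders" here are illustrative assumptions, not the actual ReadTwice implementation.

```python
# Hypothetical sketch of a segment-then-memory two-pass read (not the ReadTwice code).
import numpy as np

def split_into_segments(tokens, segment_len):
    """Break a long token sequence into fixed-size segments."""
    return [tokens[i:i + segment_len] for i in range(0, len(tokens), segment_len)]

def encode_segment(segment, embed):
    """First read: encode one segment on its own (stand-in for a Transformer encoder)."""
    return embed[segment].mean(axis=0)  # one summary vector per segment

def second_read(segment, embed, memory):
    """Second read: re-encode a segment while attending over the memory table."""
    seg = embed[segment]                              # (seg_len, dim)
    scores = seg @ memory.T                           # attention logits over memory rows
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return seg + weights @ memory                     # tokens enriched with global context

# Toy usage: a "book" of 1000 token ids, vocabulary of 50, embedding dim 16.
rng = np.random.default_rng(0)
tokens = rng.integers(0, 50, size=1000)
embed = rng.normal(size=(50, 16))

segments = split_into_segments(tokens, segment_len=128)
memory = np.stack([encode_segment(s, embed) for s in segments])      # memory table
contextualized = [second_read(s, embed, memory) for s in segments]   # second pass
```

Because each segment is encoded independently in the first pass, the segments can be processed in parallel; only the compact memory table is shared across segments in the second pass.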