Transformer-based pre-trained models, such as BERT, have achieved remarkable results on machine reading comprehension. However, due to the constraint on encoding length (e.g., 512 WordPiece tokens), a long document is usually split into multiple chunks that are read independently. This limits the reading field to individual chunks, with no information collaboration across them for long-document machine reading comprehension. To address this problem, we propose RoR, a read-over-read method that expands the reading field from chunk to document. Specifically, RoR consists of a chunk reader and a document reader. The former first predicts a set of regional answers for each chunk, which are then compacted into a highly condensed version of the original document that is guaranteed to be encodable in a single pass. The latter then predicts global answers from this condensed document. Finally, a voting strategy aggregates and reranks the regional and global answers for the final prediction. Extensive experiments on two benchmarks, QuAC and TriviaQA, demonstrate the effectiveness of RoR for long-document reading. Notably, RoR ranked 1st on the QuAC leaderboard (https://quac.ai/) at the time of submission (May 17th, 2021).
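The chunking setup described above (splitting a long document into windows that fit the 512-token encoding limit) can be sketched as follows. This is a minimal illustration assuming whitespace tokens and an overlapping sliding window; the stride value and tokenization are illustrative stand-ins, not the paper's actual WordPiece pipeline.

```python
def split_into_chunks(tokens, max_len=512, stride=256):
    """Split a token sequence into overlapping chunks.

    Each chunk holds at most `max_len` tokens; consecutive chunks
    start `stride` tokens apart, so they overlap by `max_len - stride`
    tokens and an answer spanning a chunk boundary is still fully
    contained in some chunk.
    """
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the final chunk already covers the document tail
        start += stride
    return chunks


# Example: a 1000-token "document" yields three overlapping chunks,
# each readable by an encoder limited to 512 tokens.
doc = [f"tok{i}" for i in range(1000)]
chunks = split_into_chunks(doc)
```

Each chunk would then be read independently by the chunk reader, which is exactly the per-chunk isolation that RoR's document reader is designed to overcome.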