Machine reading comprehension (MRC) aims to teach machines to read and comprehend human languages, which is a long-standing goal of natural language processing (NLP). With the rise of deep neural networks and the evolution of contextualized language models (CLMs), MRC research has experienced two significant breakthroughs. Together, MRC and CLMs have had a great impact on the NLP community. In this survey, we provide a comprehensive and comparative review of MRC, covering 1) the origin and development of MRC and CLMs, with a particular focus on the role of CLMs; 2) the impact of MRC and CLMs on the NLP community; 3) the definition, datasets, and evaluation of MRC; 4) the general MRC architecture and technical methods, viewed as a two-stage Encoder-Decoder solving architecture informed by the human cognitive process; 5) previous highlights, emerging topics, and our empirical analysis, in which we focus especially on what has worked in different periods of MRC research. We propose a full-view categorization and new taxonomies on these topics. Our primary conclusions are that 1) MRC drives the progress from language processing to language understanding; 2) the rapid improvement of MRC systems has benefited greatly from the development of CLMs; 3) the theme of MRC is gradually moving from shallow text matching to cognitive reasoning.