评价机器阅读理解的语义变换修改 (Semantics Altering Modifications for Evaluating Comprehension in Machine Reading) - 专知论文

会员服务 ·

0

Processing（编程语言） · Performer · state-of-the-art · domain shift · MoDELS ·

2021 年 6 月 15 日

Semantics Altering Modifications for Evaluating Comprehension in Machine Reading

翻译：评价机器阅读理解的语义变换修改

Viktor Schlegel,Goran Nenadic,Riza Batista-Navarro

from arxiv, AAAI 2021, final version. 7 pages content + 2 pages references

Advances in NLP have yielded impressive results for the task of machine reading comprehension (MRC), with approaches having been reported to achieve performance comparable to that of humans. In this paper, we investigate whether state-of-the-art MRC models are able to correctly process Semantics Altering Modifications (SAM): linguistically-motivated phenomena that alter the semantics of a sentence while preserving most of its lexical surface form. We present a method to automatically generate and align challenge sets featuring original and altered examples. We further propose a novel evaluation methodology to correctly assess the capability of MRC systems to process these examples independent of the data they were optimised on, by discounting for effects introduced by domain shift. In a large-scale empirical study, we apply the methodology in order to evaluate extractive MRC models with regard to their capability to correctly process SAM-enriched data. We comprehensively cover 12 different state-of-the-art neural architecture configurations and four training datasets and find that -- despite their well-known remarkable performance -- optimised models consistently struggle to correctly process semantically altered data.

翻译：机器阅读理解(MRC)任务的进展令人印象深刻,据报告,在机器阅读理解(MRC)任务方面,我们取得了令人印象深刻的成果,并采取了与人类相似的绩效。在本文件中,我们调查了最先进的MRC模型是否能够正确处理语义变换改变(SAM):改变句子语义的由语言驱动的现象,同时保留其大部分地表法形式。我们提出了一个自动生成和调整以原始和变换实例为特点的成套挑战的方法。我们进一步提出一种新的评估方法,以正确评估MRC系统处理这些例子的能力,而这种能力独立于它们所选取的数据。在一项大规模的经验研究中,我们应用这一方法来评价采掘MRC模型正确处理SAM丰富数据的能力。我们全面覆盖了12个不同的状态的神经结构配置和4个培训数据集。我们发现,尽管它们有众所周知的显著性能,但最优化的模型始终在努力争取正确处理过程语义变数据。

0

相关内容

Processing（编程语言）

Processing（编程语言）

Processing 是一门开源编程语言和与之配套的集成开发环境（IDE）的名称。Processing 在电子艺术和视觉设计社区被用来教授编程基础，并运用于大量的新媒体和互动艺术作品中。

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

45+阅读 · 2020年12月18日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

【金融机器学习课程资料】Financial Machine Learning

专知会员服务

118+阅读 · 2019年12月24日

【NLP| 推荐文章】神经网络方法的机器阅读理解：方法与趋势（Neural Machine Reading Comprehension：Methods and Trends）

专知会员服务

41+阅读 · 2019年11月24日

【NLP| 推荐文章】神经阅读理解与超越（Neural Reading Comprehension And Beyond）

【NLP| 推荐文章】神经阅读理解与超越（Neural Reading Comprehension And Beyond）

专知会员服务

26+阅读 · 2019年11月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

A Dataset for Answering Time-Sensitive Questions

Arxiv

0+阅读 · 2021年8月13日

Semantic Answer Similarity for Evaluating Question Answering Models

Arxiv

1+阅读 · 2021年8月13日

Model Compression for Domain Adaptation through Causal Effect Estimation

Arxiv

0+阅读 · 2021年8月11日

Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

Arxiv

15+阅读 · 2020年5月13日

A Study of the Tasks and Models in Machine Reading Comprehension

A Study of the Tasks and Models in Machine Reading Comprehension

Arxiv

8+阅读 · 2020年1月23日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Commonsense Knowledge + BERT for Level 2 Reading Comprehension Ability Test

Arxiv

4+阅读 · 2019年9月8日

Read + Verify: Machine Reading Comprehension with Unanswerable Questions

Arxiv

3+阅读 · 2018年11月15日

Knowledge Based Machine Reading Comprehension

Knowledge Based Machine Reading Comprehension

Arxiv

4+阅读 · 2018年9月12日

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

Arxiv

4+阅读 · 2017年11月15日

VIP会员

文章信息

相关主题

Processing（编程语言）

state-of-the-art

相关VIP内容

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

45+阅读 · 2020年12月18日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

【金融机器学习课程资料】Financial Machine Learning

专知会员服务

118+阅读 · 2019年12月24日

【NLP| 推荐文章】神经网络方法的机器阅读理解：方法与趋势（Neural Machine Reading Comprehension：Methods and Trends）

专知会员服务

41+阅读 · 2019年11月24日

【NLP| 推荐文章】神经阅读理解与超越（Neural Reading Comprehension And Beyond）

【NLP| 推荐文章】神经阅读理解与超越（Neural Reading Comprehension And Beyond）

专知会员服务

26+阅读 · 2019年11月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

最新《扩散模型原理》新书，470页pdf

无人机作战：演进、创新与未来战场

AI 智能体简史

多模态空间推理在大模型时代：综述与基准测试

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

A Dataset for Answering Time-Sensitive Questions

Arxiv

0+阅读 · 2021年8月13日

Semantic Answer Similarity for Evaluating Question Answering Models

Arxiv

1+阅读 · 2021年8月13日

Model Compression for Domain Adaptation through Causal Effect Estimation

Arxiv

0+阅读 · 2021年8月11日

Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

Arxiv

15+阅读 · 2020年5月13日

A Study of the Tasks and Models in Machine Reading Comprehension

A Study of the Tasks and Models in Machine Reading Comprehension

Arxiv

8+阅读 · 2020年1月23日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Commonsense Knowledge + BERT for Level 2 Reading Comprehension Ability Test

Arxiv

4+阅读 · 2019年9月8日

Read + Verify: Machine Reading Comprehension with Unanswerable Questions

Arxiv

3+阅读 · 2018年11月15日

Knowledge Based Machine Reading Comprehension

Knowledge Based Machine Reading Comprehension

Arxiv

4+阅读 · 2018年9月12日

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

Arxiv

4+阅读 · 2017年11月15日

微信扫码咨询专知VIP会员