Current Machine Translation (MT) systems achieve very good results on a growing variety of language pairs and datasets. However, they are known to produce fluent translation outputs that can nevertheless contain important meaning errors, thus undermining their reliability in practice. Quality Estimation (QE) is the task of automatically assessing the performance of MT systems at test time. Thus, in order to be useful, QE systems should be able to detect such errors. However, this ability is yet to be tested in current evaluation practices, where QE systems are assessed only in terms of their correlation with human judgements. In this work, we bridge this gap by proposing a general methodology for adversarial testing of QE for MT. First, we show that despite the high correlation with human judgements achieved by recent SOTA models, certain types of meaning errors remain problematic for QE to detect. Second, we show that, on average, the ability of a given model to discriminate between meaning-preserving and meaning-altering perturbations is predictive of its overall performance, thus potentially allowing QE systems to be compared without relying on manual quality annotation.