翻译何时需要背景? 数据驱动的多语言探索 (When Does Translation Require Context? A Data-driven, Multilingual Exploration) - 专知论文

会员服务 ·

0

机器翻译 · MoDELS · 可辨认的 · Machine Translation · state-of-the-art ·

2021 年 9 月 15 日

When Does Translation Require Context? A Data-driven, Multilingual Exploration

翻译：翻译何时需要背景? 数据驱动的多语言探索

Kayo Yin,Patrick Fernandes,André F. T. Martins,Graham Neubig

Although proper handling of discourse phenomena significantly contributes to the quality of machine translation (MT), common translation quality metrics do not adequately capture them. Recent works in context-aware MT attempt to target a small set of these phenomena during evaluation. In this paper, we propose a new metric, P-CXMI, which allows us to identify translations that require context systematically and confirm the difficulty of previously studied phenomena as well as uncover new ones that have not been addressed in previous work. We then develop the Multilingual Discourse-Aware (MuDA) benchmark, a series of taggers for these phenomena in 14 different language pairs, which we use to evaluate context-aware MT. We find that state-of-the-art context-aware MT models find marginal improvements over context-agnostic models on our benchmark, which suggests current models do not handle these ambiguities effectively. We release code and data to invite the MT research community to increase efforts on context-aware translation on discourse phenomena and languages that are currently overlooked.

翻译：虽然妥善处理讨论现象对机器翻译的质量有重大贡献,但通用翻译质量指标并不能充分捕捉到这些现象。最近对背景有认识的MT工程试图在评估期间针对其中的一小部分现象。在本文件中,我们提出了一个新的指标P-CXMI, 使我们能够系统地确定需要背景的翻译,并证实以前研究过的现象的困难,以及发现以前工作中没有处理过的新的现象。然后,我们用14种不同的语言为这些现象制定多种语言的Discoin-Aware(MuDA)基准,这是我们用来评估背景有觉识的MT系列标签。我们发现,最先进的环境有觉识的MT模型比我们基准上的背景无识模型发现了边际的改进,这表明目前的模型没有有效地处理这些模糊之处。我们发布代码和数据,请MT研究界就目前被忽视的语种和语种的语种,加大对背景有觉的翻译工作。

0

相关内容

机器翻译

机器翻译，又称为自动翻译，是利用计算机将一种自然语言(源语言)转换为另一种自然语言(目标语言)的过程。它是计算语言学的一个分支，是人工智能的终极目标之一，具有重要的科学研究价值。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

【SIGMOD2020-腾讯】Web规模本体可扩展构建

【SIGMOD2020-腾讯】Web规模本体可扩展构建

专知会员服务

31+阅读 · 2020年4月12日

【Google】无监督机器翻译，Unsupervised Machine Translation

【Google】无监督机器翻译，Unsupervised Machine Translation

专知会员服务

36+阅读 · 2020年3月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【剑桥大学】神经机器翻译综述论文，Neural Machine Translation: A Review，附88页pdf

【剑桥大学】神经机器翻译综述论文，Neural Machine Translation: A Review，附88页pdf

专知会员服务

37+阅读 · 2019年12月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

Yoshua Bengio，使算法知道“为什么”

Yoshua Bengio，使算法知道“为什么”

专知会员服务

8+阅读 · 2019年10月10日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

计算机 | IUI 2020等国际会议信息4条

计算机 | IUI 2020等国际会议信息4条

Call4Papers

6+阅读 · 2019年6月17日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：位置感知的长序列会话推荐

LibRec 精选：位置感知的长序列会话推荐

LibRec智能推荐

3+阅读 · 2019年5月17日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

推荐｜Andrew Ng计算机视觉教程总结

推荐｜Andrew Ng计算机视觉教程总结

全球人工智能

3+阅读 · 2017年11月23日

How Do Neural Sequence Models Generalize? Local and Global Context Cues for Out-of-Distribution Prediction

Arxiv

0+阅读 · 2021年11月4日

FuxiCTR: An Open Benchmark for Click-Through Rate Prediction

Arxiv

8+阅读 · 2020年9月12日

Context in Neural Machine Translation: A Review of Models and Evaluations

Arxiv

5+阅读 · 2019年1月25日

Information-Directed Exploration for Deep Reinforcement Learning

Information-Directed Exploration for Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年12月18日

Fusing Recency into Neural Machine Translation with an Inter-Sentence Gate Model

Arxiv

3+阅读 · 2018年6月12日

Classical Structured Prediction Losses for Sequence to Sequence Learning

Arxiv

6+阅读 · 2018年5月24日

Unsupervised Machine Translation Using Monolingual Corpora Only

Arxiv

5+阅读 · 2018年4月13日

Fine-Grained Attention Mechanism for Neural Machine Translation

Arxiv

4+阅读 · 2018年4月3日

Why not be Versatile? Applications of the SGNMT Decoder for Machine Translation

Arxiv

3+阅读 · 2018年3月20日

What Level of Quality can Neural Machine Translation Attain on Literary Text?

Arxiv

5+阅读 · 2018年1月15日

VIP会员

文章信息

相关主题

Machine Translation

state-of-the-art

相关VIP内容

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

【SIGMOD2020-腾讯】Web规模本体可扩展构建

【SIGMOD2020-腾讯】Web规模本体可扩展构建

专知会员服务

31+阅读 · 2020年4月12日

【Google】无监督机器翻译，Unsupervised Machine Translation

【Google】无监督机器翻译，Unsupervised Machine Translation

专知会员服务

36+阅读 · 2020年3月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【剑桥大学】神经机器翻译综述论文，Neural Machine Translation: A Review，附88页pdf

【剑桥大学】神经机器翻译综述论文，Neural Machine Translation: A Review，附88页pdf

专知会员服务

37+阅读 · 2019年12月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

Yoshua Bengio，使算法知道“为什么”

Yoshua Bengio，使算法知道“为什么”

专知会员服务

8+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《人工智能绝不能完全自主》

《人工智能的法律与伦理：军事自主机器独特挑战的深度剖析》316页

从数据到主导：AI与兵棋推演构筑决策优势

《特洛伊木马货柜：武器化集装箱的战略威胁》最新报告

相关资讯

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

计算机 | IUI 2020等国际会议信息4条

计算机 | IUI 2020等国际会议信息4条

Call4Papers

6+阅读 · 2019年6月17日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：位置感知的长序列会话推荐

LibRec 精选：位置感知的长序列会话推荐

LibRec智能推荐

3+阅读 · 2019年5月17日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

推荐｜Andrew Ng计算机视觉教程总结

推荐｜Andrew Ng计算机视觉教程总结

全球人工智能

3+阅读 · 2017年11月23日

相关论文

How Do Neural Sequence Models Generalize? Local and Global Context Cues for Out-of-Distribution Prediction

Arxiv

0+阅读 · 2021年11月4日

FuxiCTR: An Open Benchmark for Click-Through Rate Prediction

Arxiv

8+阅读 · 2020年9月12日

Context in Neural Machine Translation: A Review of Models and Evaluations

Arxiv

5+阅读 · 2019年1月25日

Information-Directed Exploration for Deep Reinforcement Learning

Information-Directed Exploration for Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年12月18日

Fusing Recency into Neural Machine Translation with an Inter-Sentence Gate Model

Arxiv

3+阅读 · 2018年6月12日

Classical Structured Prediction Losses for Sequence to Sequence Learning

Arxiv

6+阅读 · 2018年5月24日

Unsupervised Machine Translation Using Monolingual Corpora Only

Arxiv

5+阅读 · 2018年4月13日

Fine-Grained Attention Mechanism for Neural Machine Translation

Arxiv

4+阅读 · 2018年4月3日

Why not be Versatile? Applications of the SGNMT Decoder for Machine Translation

Arxiv

3+阅读 · 2018年3月20日

What Level of Quality can Neural Machine Translation Attain on Literary Text?

Arxiv

5+阅读 · 2018年1月15日

微信扫码咨询专知VIP会员