Pronouns are important determinants of a text's meaning but difficult to translate. This is because pronoun choice can depend on entities described in previous sentences, and in some languages pronouns may be dropped when the referent is inferable from context. These issues can lead Neural Machine Translation (NMT) systems to make critical errors on pronouns that impair intelligibility and even reinforce gender bias. We investigate the severity of this pronoun issue, showing that (1) in some domains, pronoun choice can account for more than half of an NMT system's errors, and (2) pronouns have a disproportionately large impact on perceived translation quality. We then investigate a possible solution: fine-tuning BERT on a pronoun prediction task using chunks of source-side sentences, then using the resulting classifier to repair the translations of an existing NMT model. We offer an initial case study of this approach for the Japanese-English language pair, observing that a small number of translations are significantly improved according to human evaluators.