This paper demonstrates that multilingual pretraining, a proper fine-tuning method, and a large-scale parallel dataset from multiple auxiliary languages are all critical for zero-shot translation, where the NMT model is tested on source languages unseen during supervised training. Following this idea, we present SixT++, a strong many-to-English NMT model that supports 100 source languages but is trained once with a parallel dataset from only six source languages. SixT++ initializes the decoder embedding and the full encoder with XLM-R large, and then trains the encoder and decoder layers with a simple two-stage training strategy. SixT++ achieves impressive performance on many-to-English translation. It significantly outperforms CRISS and m2m-100, two strong multilingual NMT systems, with average gains of 7.2 and 5.0 BLEU, respectively. Additionally, SixT++ offers a set of model parameters that can be further fine-tuned to develop unsupervised NMT models for low-resource languages. With back-translation on monolingual data of the low-resource language, it outperforms all current state-of-the-art unsupervised methods on Nepali and Sinhala for both translating into and from English.
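The sketch below is a minimal PyTorch illustration of the setup described above, not the authors' released implementation: the full encoder and the decoder embedding are initialized from XLM-R large (via HuggingFace transformers), a randomly initialized decoder is added on top, and training proceeds in two stages. The decoder size and the exact freezing schedule of the two stages are assumptions for illustration, not details taken from the paper.

```python
import torch.nn as nn
from transformers import XLMRobertaModel

# Full encoder and decoder embedding are initialized from XLM-R large.
encoder = XLMRobertaModel.from_pretrained("xlm-roberta-large")
hidden = encoder.config.hidden_size   # 1024 for xlm-roberta-large
vocab = encoder.config.vocab_size

decoder_embed = nn.Embedding.from_pretrained(
    encoder.get_input_embeddings().weight.detach().clone(), freeze=False
)

# Randomly initialized Transformer decoder stack (layer count and head count
# are placeholders, not the paper's configuration).
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=hidden, nhead=16, batch_first=True),
    num_layers=12,
)
output_proj = nn.Linear(hidden, vocab)

def set_trainable(trainable):
    """Freeze all components, then unfreeze those listed in `trainable`."""
    for module in (encoder, decoder, decoder_embed, output_proj):
        flag = module in trainable
        for p in module.parameters():
            p.requires_grad = flag

# Stage 1 (assumed schedule): keep the pretrained XLM-R parameters frozen and
# train only the new decoder layers and output projection on the
# six-language parallel data.
set_trainable({decoder, output_proj})
# ... supervised training loop for stage 1 ...

# Stage 2 (assumed schedule): unfreeze the remaining components and continue
# fine-tuning end to end on the same parallel data.
set_trainable({encoder, decoder, decoder_embed, output_proj})
# ... supervised training loop for stage 2 ...
```

Because the encoder and decoder embedding come from a multilingual pretrained model covering 100 languages, the resulting many-to-English model can accept source languages never seen in the supervised parallel data.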