Conditional masked language models (CMLMs) have shown impressive progress in non-autoregressive machine translation (NAT). They learn a conditional translation model by predicting a randomly masked subset of tokens in the target sentence. Building on the CMLM framework, we introduce Multi-view Subset Regularization (MvSR), a novel regularization method that improves the performance of NAT models. Specifically, MvSR consists of two parts: (1) \textit{shared mask consistency}: we forward the same target with different mask strategies and encourage the predictions at shared masked positions to be consistent with each other; (2) \textit{model consistency}: we maintain an exponential moving average of the model weights and encourage the predictions of the average model and the online model to be consistent. Without changing the CMLM-based architecture, our approach achieves remarkable performance on three public benchmarks, with gains of 0.36-1.14 BLEU over previous NAT models. Moreover, compared with the stronger Transformer baseline, we reduce the gap to 0.01-0.44 BLEU on the small datasets (WMT16 RO$\leftrightarrow$EN and IWSLT DE$\rightarrow$EN).
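For concreteness, the following is a minimal PyTorch sketch of one MvSR training step as described above. The model interface (`model(src, masked_tgt)` returning per-position logits), the masking helper, the loss weights `alpha`/`beta`, and the EMA decay are illustrative assumptions for this sketch, not details taken from the paper.

```python
# Minimal sketch of one MvSR training step under the CMLM framework.
# `model` is an assumed CMLM-style module mapping (src, masked_tgt) ->
# logits of shape [batch, tgt_len, vocab]. All hyperparameters below
# (mask_id, pad_id, alpha, beta, decay) are hypothetical placeholders.
import copy
import torch
import torch.nn.functional as F

def random_mask(tgt, mask_id, pad_id):
    """CMLM-style masking: replace a random subset of non-pad tokens."""
    lengths = (tgt != pad_id).sum(-1, keepdim=True)
    ratio = torch.rand(tgt.size(0), 1, device=tgt.device)
    cutoff = (ratio * lengths).long().clamp(min=1)   # mask at least one token
    scores = torch.rand(tgt.shape, device=tgt.device)
    scores = scores.masked_fill(tgt == pad_id, 2.0)  # never mask padding
    ranks = scores.argsort(-1).argsort(-1)           # per-row rank of each score
    mask = ranks < cutoff                            # keep lowest-ranked positions
    return tgt.masked_fill(mask, mask_id), mask

def symmetric_kl(logits_a, logits_b):
    """Symmetric KL divergence between two predictive distributions."""
    la, lb = F.log_softmax(logits_a, -1), F.log_softmax(logits_b, -1)
    return 0.5 * (F.kl_div(la, lb, log_target=True, reduction="batchmean")
                  + F.kl_div(lb, la, log_target=True, reduction="batchmean"))

def mvsr_step(model, ema_model, src, tgt,
              mask_id=3, pad_id=1, alpha=1.0, beta=1.0, decay=0.999):
    # Two views of the same target under independent random mask strategies.
    in1, m1 = random_mask(tgt, mask_id, pad_id)
    in2, m2 = random_mask(tgt, mask_id, pad_id)
    logits1, logits2 = model(src, in1), model(src, in2)

    # Standard CMLM cross-entropy on each view's masked positions.
    ce = F.cross_entropy(logits1[m1], tgt[m1]) + F.cross_entropy(logits2[m2], tgt[m2])

    # (1) Shared mask consistency: agreement at positions masked in BOTH views.
    shared = m1 & m2
    smc = symmetric_kl(logits1[shared], logits2[shared]) if shared.any() \
        else logits1.new_zeros(())

    # (2) Model consistency: match the EMA ("average") model's predictions.
    with torch.no_grad():
        ema_logits = ema_model(src, in1)
    mc = F.kl_div(F.log_softmax(logits1[m1], -1),
                  F.log_softmax(ema_logits[m1], -1),
                  log_target=True, reduction="batchmean")

    loss = ce + alpha * smc + beta * mc
    loss.backward()

    # Update the exponential moving average of the online weights.
    with torch.no_grad():
        for p_ema, p in zip(ema_model.parameters(), model.parameters()):
            p_ema.mul_(decay).add_(p, alpha=1.0 - decay)
    return loss

# Usage: ema_model = copy.deepcopy(model); then call mvsr_step per batch.
```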