Self-Guided Curriculum Learning for Neural Machine Translation

In machine learning, a well-trained model is assumed to be able to recover its training labels, i.e. the synthetic labels predicted by the model should be as close to the ground-truth labels as possible. Inspired by this, we propose a self-guided curriculum strategy that encourages neural machine translation (NMT) models to learn according to this recovery criterion, treating the recovery degree of each training example as its learning difficulty. Specifically, we adopt the sentence-level BLEU score as a proxy for the recovery degree. Unlike existing curricula that rely on linguistic prior knowledge or third-party language models, our difficulty measure is better suited to gauging how well the NMT model has mastered the training data. Experiments on translation benchmarks, including WMT14 English$\Rightarrow$German and WMT17 Chinese$\Rightarrow$English, demonstrate that our approach consistently improves translation performance over a strong Transformer baseline.
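The difficulty measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the add-one smoothing, the whitespace tokenization, the `translate` callable, the function names, and the easy-to-hard ordering are all assumptions made for illustration, since the abstract only states that sentence-level BLEU between the model's prediction and the reference serves as the proxy for recovery degree.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """Smoothed sentence-level BLEU in [0, 1] for whitespace-tokenized strings.

    Add-one smoothing on orders above 1 is an assumption of this sketch;
    the abstract does not say which smoothing variant is used.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngram_counts(hyp, n), ngram_counts(ref, n)
        # Clipped matches: each hypothesis n-gram counts at most as often
        # as it appears in the reference.
        matches = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(len(hyp) - n + 1, 0)
        if n > 1:
            matches, total = matches + 1, total + 1
        if matches == 0:
            return 0.0  # no unigram overlap at all
        log_precision += math.log(matches / total) / max_n
    brevity_penalty = min(1.0, math.exp(1.0 - len(ref) / len(hyp)))
    return brevity_penalty * math.exp(log_precision)


def order_by_difficulty(examples, translate):
    """Sort (source, reference) pairs from easiest to hardest.

    `translate` is a hypothetical callable mapping a source sentence to the
    model's predicted translation. Difficulty is taken as 1 - BLEU, so
    examples the model recovers well come first (an assumed convention).
    """
    scored = []
    for source, reference in examples:
        recovery = sentence_bleu(translate(source), reference)
        scored.append((1.0 - recovery, source, reference))
    scored.sort(key=lambda item: item[0])
    return scored
```

In practice `translate` would wrap a trained NMT model, and the sorted list would feed whatever schedule exposes training examples to the model over time. A maintained BLEU implementation would normally replace the hand-written scorer above.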


