Token-wise Curriculum Learning for Neural Machine Translation

Existing curriculum learning approaches to Neural Machine Translation (NMT) require sampling a sufficient number of "easy" samples from the training data at the early stage of training. This is not always achievable for low-resource languages, where the amount of training data is limited. To address this limitation, we propose a novel token-wise curriculum learning approach that creates a sufficient number of easy samples. Specifically, the model learns to predict a short sub-sequence from the beginning of each target sentence at the early stage of training, and the sub-sequence is then gradually expanded as training progresses. This curriculum design is inspired by the cumulative effect of translation errors, which makes later tokens more difficult to predict than earlier ones. Extensive experiments show that our approach consistently outperforms baselines on 5 language pairs, especially for low-resource languages. Combining our approach with sentence-level methods further improves performance on high-resource languages.
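The prefix-expansion idea in the abstract can be illustrated with a minimal sketch. It assumes a linear schedule and a hard 0/1 loss mask; the abstract specifies neither, so both are illustrative choices rather than the authors' method. The names `prefix_fraction`, `loss_mask`, `masked_nll` and the `start` parameter are hypothetical.

```python
import math


def prefix_fraction(step, total_steps, start=0.2):
    """Fraction of each target sentence that is trained on at `step`.

    Assumption: grows linearly from `start` to 1.0 over `total_steps`.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start + (1.0 - start) * progress


def loss_mask(target_len, fraction):
    """1 for target positions inside the current prefix, 0 for the rest."""
    # Round before ceil so float noise (e.g. 6.000000000000001) does not
    # add a spurious extra token.
    keep = math.ceil(round(fraction * target_len, 9))
    keep = min(max(keep, 1), target_len)  # always keep at least one token
    return [1 if i < keep else 0 for i in range(target_len)]


def masked_nll(token_losses, mask):
    """Average per-token loss over the unmasked prefix only."""
    total = sum(loss * m for loss, m in zip(token_losses, mask))
    return total / sum(mask)


if __name__ == "__main__":
    # Hypothetical per-token losses for one 10-token target sentence.
    losses = [0.5, 0.7, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9, 2.1, 2.3]
    for step in (0, 50, 100):
        mask = loss_mask(len(losses), prefix_fraction(step, 100))
        print(step, sum(mask), masked_nll(losses, mask))
```

Early in training only the first few target tokens contribute to the loss, so every sentence yields an "easy" sample regardless of corpus size; by the final step the mask covers the full sentence and the loss reduces to the usual token-averaged objective.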


