Improving Neural Machine Translation by Bidirectional Training

We present a simple and effective pretraining strategy, bidirectional training (BiT), for neural machine translation. Specifically, we update the model parameters bidirectionally in the early stage of training and then tune the model normally. To achieve bidirectional updating, we simply reconstruct the training samples from "src$\rightarrow$tgt" to "src+tgt$\rightarrow$tgt+src", without any complicated model modifications. Notably, our approach adds no parameters or training steps and requires only the parallel data. Experimental results show that BiT significantly improves SOTA neural machine translation performance across 15 translation tasks on 8 language pairs (data sizes range from 160K to 38M). Encouragingly, our approach complements existing data manipulation strategies, i.e., back translation, data distillation, and data diversification. Extensive analyses show that our approach functions as a novel bilingual code-switcher, obtaining better bilingual alignment.
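The sample reconstruction described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes "src+tgt$\rightarrow$tgt+src" means concatenating the two sides of the parallel corpus so that every pair appears once in each direction. The function names and the `bidirectional_fraction` parameter (including its default of one third) are hypothetical, since the abstract does not say how long the early bidirectional stage lasts.

```python
from typing import List, Tuple


def make_bidirectional_corpus(
    src: List[str], tgt: List[str]
) -> Tuple[List[str], List[str]]:
    """Rebuild a parallel corpus from src->tgt into src+tgt -> tgt+src.

    Each original pair (s, t) yields two training pairs: (s, t) and (t, s).
    """
    if len(src) != len(tgt):
        raise ValueError("source and target sides must have the same length")
    return src + tgt, tgt + src


def corpus_for_step(
    step: int,
    total_steps: int,
    src: List[str],
    tgt: List[str],
    bidirectional_fraction: float = 1 / 3,  # hypothetical value, not from the paper
) -> Tuple[List[str], List[str]]:
    """Pick the training corpus for a given step.

    Early steps use the bidirectional corpus; later steps use the normal
    src->tgt corpus. The total number of steps is unchanged.
    """
    if step < int(total_steps * bidirectional_fraction):
        return make_bidirectional_corpus(src, tgt)
    return src, tgt


# Toy parallel data (illustrative only).
src = ["guten Morgen", "danke"]
tgt = ["good morning", "thank you"]

bi_src, bi_tgt = make_bidirectional_corpus(src, tgt)
for source_sentence, target_sentence in zip(bi_src, bi_tgt):
    print(f"{source_sentence} -> {target_sentence}")
```

Because only the data is rearranged, the model architecture and parameter count stay untouched, which matches the abstract's claim that no model modification is needed.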

