ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation

Transfer learning is a simple and powerful method for boosting the performance of low-resource neural machine translation (NMT). Existing transfer learning methods for NMT are static: they transfer knowledge from a parent model to a child model only once, via parameter initialization. In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which continuously transfers knowledge from the parent model during the training of the child model. Specifically, for each training instance of the child model, ConsistTL constructs the semantically equivalent instance for the parent model and encourages prediction consistency between the parent and child on this instance, which is equivalent to the child model learning each instance under the guidance of the parent model. Experimental results on five low-resource NMT tasks demonstrate that ConsistTL yields significant improvements over strong transfer learning baselines, with a gain of up to 1.7 BLEU over the existing back-translation model on the widely used WMT17 Turkish-English benchmark. Further analysis reveals that ConsistTL can improve the inference calibration of the child model. Code and scripts are freely available at https://github.com/NLP2CT/ConsistTL.
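The per-instance objective described above can be illustrated with a minimal sketch for a single target token. The abstract does not state which divergence measures prediction consistency or how it is weighted, so the KL divergence from parent to child, the weight `alpha`, and all function names below are assumptions made for illustration, not the authors' implementation (which is available at the repository linked above).

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(child_probs, target_index):
    """Standard NMT loss: negative log-probability of the reference token."""
    return -math.log(child_probs[target_index])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); eps guards against log(0). Assumed consistency measure."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def consistency_training_loss(child_logits, parent_logits, target_index, alpha=1.0):
    """Child loss = cross-entropy + alpha * consistency with the parent prediction.

    parent_logits are assumed to come from the (fixed) parent model run on the
    semantically equivalent parent-side instance; alpha is a hypothetical weight.
    """
    child_probs = softmax(child_logits)
    parent_probs = softmax(parent_logits)
    ce = cross_entropy(child_probs, target_index)
    consistency = kl_divergence(parent_probs, child_probs)
    return ce + alpha * consistency


if __name__ == "__main__":
    # Toy vocabulary of three tokens; the reference token has index 0.
    agree = consistency_training_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1], 0)
    disagree = consistency_training_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0], 0)
    print(f"parent agrees: {agree:.4f}  parent disagrees: {disagree:.4f}")
```

When the parent and child predictions coincide, the consistency term vanishes and the loss reduces to ordinary cross-entropy; the further the child drifts from the parent's prediction, the larger the penalty. In a real system this term would be computed for every target position and averaged over the batch.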

