With multilingual machine translation (MMT) models continuing to grow in size and in the number of supported languages, it is natural to reuse and upgrade existing models to save computation as data becomes available in more languages. However, adding new languages requires updating the vocabulary, which complicates the reuse of embeddings. The question of how to reuse existing models while also making architectural changes to provide capacity for both old and new languages has not been closely studied. In this work, we introduce three techniques that help speed up effective learning of the new languages and alleviate catastrophic forgetting despite vocabulary and architecture mismatches. Our results show that by (1) carefully initializing the network, (2) applying learning rate scaling, and (3) performing data up-sampling, it is possible to exceed the performance of a same-sized baseline model with only 30% of the computation, and to recover the performance of a larger model trained from scratch with over 50% less computation. Furthermore, our analysis reveals that the introduced techniques help learn the new translation directions more effectively while alleviating catastrophic forgetting. We hope our work will guide research into more efficient approaches to growing the language coverage of MMT models and ultimately maximize the reuse of existing models.
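As an illustration of the first technique, a minimal sketch of vocabulary-aware embedding initialization under a vocabulary mismatch is shown below. The abstract does not specify the exact initialization scheme, so this is only an assumed reading: rows for tokens shared with the old vocabulary are copied from the existing model, and rows for newly added tokens are freshly initialized. The function name grow_embeddings, the init_std parameter, and the toy vocabularies are hypothetical and introduced here purely for illustration.

```python
import numpy as np

def grow_embeddings(old_emb, old_vocab, new_vocab, init_std=0.02, seed=0):
    """Build an embedding matrix for an extended vocabulary.

    Rows for tokens present in the old vocabulary are copied from the
    existing model; rows for newly added tokens are randomly initialized.
    `old_vocab` / `new_vocab` map token strings to row indices.
    """
    rng = np.random.default_rng(seed)
    dim = old_emb.shape[1]
    new_emb = rng.normal(0.0, init_std, size=(len(new_vocab), dim))
    for tok, new_idx in new_vocab.items():
        old_idx = old_vocab.get(tok)
        if old_idx is not None:
            new_emb[new_idx] = old_emb[old_idx]  # reuse the trained row
    return new_emb

# Toy usage: shared tokens keep their trained vectors, the new token gets a fresh row.
old_vocab = {"<pad>": 0, "hello": 1}
new_vocab = {"<pad>": 0, "hello": 1, "bonjour": 2}
old_emb = np.ones((2, 4))
print(grow_embeddings(old_emb, old_vocab, new_vocab))
```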