We propose a straightforward vocabulary adaptation scheme that extends the language capacity of multilingual machine translation models, paving the way towards efficient continual learning for multilingual machine translation. Our approach is suitable for large-scale datasets, applies to distant languages with unseen scripts, incurs only minor degradation in translation performance on the original language pairs, and remains competitive even when only monolingual data is available for the new languages.