Multilingual Neural Machine Translation (MNMT) enables a single system to translate sentences from multiple source languages into multiple target languages, greatly reducing deployment costs compared with conventional bilingual systems. The training benefit of MNMT, however, is often limited to many-to-one directions; the model performs poorly on one-to-many and many-to-many directions, particularly under the zero-shot setup. To address this issue, this paper discusses how to practically build MNMT systems that serve arbitrary X-Y translation directions while leveraging multilinguality through a two-stage training strategy of pretraining and finetuning. Experimenting with the WMT'21 multilingual translation task, we demonstrate that our systems outperform the conventional baselines of direct bilingual models and pivot translation models for most directions, with average gains of +6.0 and +4.1 BLEU respectively, without requiring architecture changes or extra data collection. Moreover, we also examine our proposed approach in an extremely large-scale data setting to accommodate practical deployment scenarios.
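As a rough illustration of the two-stage strategy described above, the following minimal Python sketch shows a single model first pretrained on all available (typically English-centric) direction pairs, then finetuned on the direct X-Y directions to be served. This is not the paper's code: the Example type, the Model class, and its fit() method are hypothetical stand-ins for a real multilingual Transformer trainer; the target-language tag is the common MNMT convention for steering one decoder toward many output languages.

from dataclasses import dataclass

@dataclass
class Example:
    src_lang: str
    tgt_lang: str
    src: str
    tgt: str

def add_target_tag(ex: Example) -> str:
    # Common MNMT convention: prepend a target-language token so a
    # single model knows which language to decode into.
    return f"<2{ex.tgt_lang}> {ex.src}"

class Model:
    """Hypothetical stand-in for a real multilingual NMT trainer."""
    def fit(self, pairs: list[tuple[str, str]], epochs: int = 1) -> None:
        for _ in range(epochs):
            for src, tgt in pairs:
                pass  # one gradient step on (src, tgt) in a real system

# Stage 1: pretrain on all directions pooled together; English-centric
# parallel data typically dominates this pool.
pretrain_data = [
    Example("de", "en", "Hallo Welt", "Hello world"),
    Example("en", "cs", "Hello world", "Ahoj světe"),
]
# Stage 2: finetune on the non-English X->Y directions of interest
# (illustrative toy examples only).
finetune_data = [
    Example("de", "cs", "Hallo Welt", "Ahoj světe"),
]

model = Model()
model.fit([(add_target_tag(ex), ex.tgt) for ex in pretrain_data], epochs=5)
model.fit([(add_target_tag(ex), ex.tgt) for ex in finetune_data], epochs=2)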