Multilingual neural machine translation (MNMT) trained on multiple language pairs has attracted considerable attention because sharing knowledge across languages reduces model parameters and training costs. Nonetheless, multilingual training suffers from degradation of the shared parameters caused by negative interference among different translation directions, especially for high-resource languages. In this paper, we propose a multilingual translation model with high-resource language-specific training (HLT-MT) to alleviate this negative interference; it adopts two-stage training with a language-specific selection mechanism. Specifically, we first train the multilingual model only on the high-resource pairs and select language-specific modules at the top of the decoder to enhance the translation quality of high-resource directions. Next, the model is further trained on all available corpora to transfer knowledge from high-resource languages (HRLs) to low-resource languages (LRLs). Experimental results show that HLT-MT outperforms various strong baselines on the WMT-10 and OPUS-100 benchmarks. Furthermore, analytical experiments validate the effectiveness of our method in mitigating negative interference in multilingual training.
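To make the language-specific selection mechanism concrete, the sketch below shows one plausible way a top decoder layer could route each batch through a feed-forward module dedicated to the target language, with a shared module as the fallback for low-resource directions. The class name, module layout, and hyperparameters are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class LanguageSpecificDecoderLayer(nn.Module):
    """Minimal sketch (assumed details): a top decoder layer whose
    feed-forward sub-layer is selected by target language, with a
    shared fallback for directions without a dedicated module.
    Encoder-decoder cross-attention is omitted for brevity."""

    def __init__(self, d_model: int, d_ff: int, hrl_langs: list[str]):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        # One FFN per high-resource target language, plus a shared one.
        self.ffns = nn.ModuleDict({
            lang: nn.Sequential(
                nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
            )
            for lang in hrl_langs + ["shared"]
        })
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor, tgt_lang: str) -> torch.Tensor:
        attn_out, _ = self.self_attn(x, x, x)
        x = self.norm1(x + attn_out)
        # Language-specific selection: use the dedicated module if one
        # exists for this direction, otherwise fall back to the shared one.
        key = tgt_lang if tgt_lang in self.ffns else "shared"
        return self.norm2(x + self.ffns[key](x))

# Usage sketch of the two-stage schedule: stage 1 trains only on
# high-resource pairs, so gradients reach the dedicated modules;
# stage 2 adds all corpora, and low-resource directions route
# through the shared module.
layer = LanguageSpecificDecoderLayer(d_model=512, d_ff=2048, hrl_langs=["de", "fr"])
out = layer(torch.randn(2, 10, 512), tgt_lang="de")
```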