In this paper, we present an extensive investigation of multi-bridge, many-to-many multilingual NMT models (MB-M2M), i.e., models trained on non-English language pairs in addition to English-centric language pairs. In addition to validating previous work showing that MB-M2M models can overcome zero-shot translation problems, our analysis reveals the following results about multi-bridge models: (1) it is possible to extract a reasonable amount of parallel corpora between non-English languages, even for low-resource languages; (2) with limited non-English-centric data, MB-M2M models are competitive with or outperform pivot models; and (3) MB-M2M models can outperform English-Any models and perform on par with Any-English models, so a single multilingual NMT system can serve all translation directions.