Extremely low-resource machine translation for closely related languages

Multilingual training is an effective way to improve extremely low-resource neural machine translation, and it can be improved further by using back-translation to turn monolingual data into synthetic bilingual corpora. This work focuses on closely related languages of the Uralic family spoken in the Estonian and Finnish geographical regions. We find that multilingual learning and synthetic corpora increase translation quality in every language pair for which we have data. We show that transfer learning and fine-tuning are very effective for low-resource machine translation and achieve the best results. We collected new parallel data for Võro, North Saami and South Saami, and we present the first neural machine translation results for these languages.
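As a rough illustration of the back-translation step mentioned in the abstract, the sketch below pairs genuine target-language sentences with machine-generated source sentences to form a synthetic parallel corpus. It is not the authors' pipeline: `back_translate`, `toy_reverse_model` and the sample sentences are hypothetical names and data introduced here, and the toy function merely stands in for a trained target-to-source translation model.

```python
from typing import Callable, List, Tuple


def back_translate(
    target_monolingual: List[str],
    reverse_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Build synthetic (source, target) pairs from target-side monolingual text.

    `reverse_translate` is assumed to be a target-to-source translation
    function; here it is passed in so any model could be plugged in.
    """
    pairs = []
    for target in target_monolingual:
        target = target.strip()
        if not target:
            # Skip empty lines so they do not become training examples.
            continue
        synthetic_source = reverse_translate(target)
        # The source side is synthetic; the target side is genuine text.
        pairs.append((synthetic_source, target))
    return pairs


def toy_reverse_model(sentence: str) -> str:
    """Hypothetical placeholder for a trained target-to-source model."""
    return "<synthetic> " + sentence


synthetic = back_translate(["Tere hommikust!", "  ", "Aitäh."], toy_reverse_model)
for source, target in synthetic:
    print(source, "|||", target)
```

In a real setup the resulting pairs would be mixed with the genuine parallel data before training the source-to-target model; how the two are weighted or mixed is a design choice that this sketch does not address.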

