Unsupervised machine translation, which utilizes unpaired monolingual corpora as training data, has achieved performance comparable to supervised machine translation. However, it still struggles in data-scarce domains. To address this issue, this paper presents a novel meta-learning algorithm for unsupervised neural machine translation (UNMT) that trains the model to adapt to a new domain using only a small amount of training data. We assume that domain-general knowledge is a significant factor in handling data-scarce domains. Hence, we extend the meta-learning algorithm, which exploits knowledge learned from high-resource domains, to boost the performance of low-resource UNMT. Our model surpasses a transfer learning-based approach by up to 2-4 BLEU points. Extensive experimental results show that our proposed algorithm enables fast adaptation and consistently outperforms other baseline models.
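To make the meta-learning idea concrete, the following is a minimal first-order sketch (a Reptile-style update, chosen for simplicity) of meta-training across high-resource domains so that the meta-parameters adapt quickly to a new domain from a few batches. It is not the paper's exact algorithm; `unmt_loss` and `domain.sample_batches` are hypothetical placeholders standing in for the UNMT training objective (e.g., denoising plus back-translation) and a per-domain data loader.

```python
# Minimal first-order (Reptile-style) meta-training sketch, assuming PyTorch.
# `model`, `unmt_loss`, and the domain objects are hypothetical placeholders.
import copy
import random
import torch

def meta_train_step(model, domains, unmt_loss,
                    inner_lr=1e-3, meta_lr=1e-4, inner_steps=3):
    """One meta-training step: adapt a clone of the model to a sampled
    high-resource domain, then move the meta-parameters toward the
    adapted weights so they become easy to fine-tune on new domains."""
    domain = random.choice(domains)            # sample one source domain
    learner = copy.deepcopy(model)             # task-specific clone
    inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)

    # Inner loop: a few gradient steps on a small support set from the domain.
    for batch in domain.sample_batches(inner_steps):
        inner_opt.zero_grad()
        loss = unmt_loss(learner, batch)       # UNMT objective (placeholder)
        loss.backward()
        inner_opt.step()

    # Outer (meta) update: interpolate meta-parameters toward adapted ones.
    with torch.no_grad():
        for p_meta, p_task in zip(model.parameters(), learner.parameters()):
            p_meta.add_(meta_lr * (p_task - p_meta))
```

At test time, the same inner loop is run once on the small in-domain corpus of the target domain, which is how the fast adaptation described above is realized.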