Unsupervised machine translation, which utilizes unpaired monolingual corpora as training data, has achieved performance comparable to supervised machine translation. However, it still struggles in data-scarce domains. To address this issue, this paper presents a novel meta-learning algorithm for unsupervised neural machine translation (UNMT) that trains the model to adapt to a new domain using only a small amount of training data. We assume that domain-general knowledge is a significant factor in handling data-scarce domains. Hence, we extend the meta-learning algorithm, which exploits knowledge learned from high-resource domains, to boost the performance of low-resource UNMT. Our model surpasses a transfer learning-based approach by up to 2-4 BLEU points. Extensive experimental results show that our proposed algorithm enables fast adaptation and consistently outperforms other baseline models.
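To make the meta-learning idea concrete, the following is a minimal first-order sketch (a Reptile-style update, chosen for simplicity) of meta-training across high-resource domains so that the meta-parameters adapt quickly to a new domain from a few batches. It is not the paper's exact algorithm; `unmt_loss` and `domain.sample_batches` are hypothetical placeholders standing in for the UNMT training objective (e.g., denoising plus back-translation) and a per-domain data loader.

```python
# Minimal first-order (Reptile-style) meta-training sketch, assuming PyTorch.
# `model`, `unmt_loss`, and the domain objects are hypothetical placeholders.
import copy
import random
import torch

def meta_train_step(model, domains, unmt_loss,
                    inner_lr=1e-3, meta_lr=1e-4, inner_steps=3):
    """One meta-training step: adapt a clone of the model to a sampled
    high-resource domain, then move the meta-parameters toward the
    adapted weights so they become easy to fine-tune on new domains."""
    domain = random.choice(domains)            # sample one source domain
    learner = copy.deepcopy(model)             # task-specific clone
    inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)

    # Inner loop: a few gradient steps on a small support set from the domain.
    for batch in domain.sample_batches(inner_steps):
        inner_opt.zero_grad()
        loss = unmt_loss(learner, batch)       # UNMT objective (placeholder)
        loss.backward()
        inner_opt.step()

    # Outer (meta) update: interpolate meta-parameters toward adapted ones.
    with torch.no_grad():
        for p_meta, p_task in zip(model.parameters(), learner.parameters()):
            p_meta.add_(meta_lr * (p_task - p_meta))
```

At test time, the same inner loop is run once on the small in-domain corpus of the target domain, which is how the fast adaptation described above is realized.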