Neural machine translation requires large amounts of parallel training text to learn a reasonable-quality translation model. This is particularly problematic for language pairs for which sufficient parallel text is not available. In this paper, we use monolingual linguistic resources on the source side to address this challenging problem via a multi-task learning approach. More specifically, we scaffold the machine translation task on auxiliary tasks including semantic parsing, syntactic parsing, and named-entity recognition. This effectively injects semantic and/or syntactic knowledge into the translation model, which would otherwise require a large amount of training bitext. We empirically evaluate our multi-task learning approach and show its effectiveness on three translation tasks: English-to-French, English-to-Farsi, and English-to-Vietnamese.
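
The core idea of scaffolding translation on auxiliary tasks can be illustrated with a toy sketch of hard parameter sharing: a shared encoder updated by every task, plus task-specific heads. This is a minimal illustration of the multi-task learning principle, not the paper's actual architecture; the scalar model, task names, and hyperparameters below are hypothetical.

```python
import random

# Toy sketch (not the paper's implementation): multi-task learning with a
# shared encoder and task-specific heads. The translation task and the
# auxiliary tasks (e.g. syntactic parsing, NER) all update the shared
# encoder parameters, so linguistic knowledge learned from the auxiliary
# tasks flows into the representation used for translation.

shared_encoder = {"W": 0.5}   # parameters shared across all tasks
task_heads = {                # task-specific parameters
    "translation": {"V": 0.1},
    "parsing": {"V": 0.2},
    "ner": {"V": 0.3},
}

def train_step(task, x, y, lr=0.1):
    """One SGD step on a scalar toy model: y_hat = V * (W * x)."""
    h = shared_encoder["W"] * x         # shared encoding of the input
    y_hat = task_heads[task]["V"] * h   # task-specific prediction
    err = y_hat - y
    # Gradients of the squared-error loss 0.5 * err**2.
    grad_V = err * h
    grad_W = err * task_heads[task]["V"] * x
    task_heads[task]["V"] -= lr * grad_V
    shared_encoder["W"] -= lr * grad_W  # every task shapes the shared encoder
    return 0.5 * err ** 2

random.seed(0)
tasks = ["translation", "parsing", "ner"]
for step in range(2000):
    task = random.choice(tasks)         # alternate tasks during training
    train_step(task, x=1.0, y=1.0)
```

Because the encoder parameters receive gradients from all tasks, each auxiliary task acts as a regularizer and knowledge source for the translation task, which is the intuition behind scaffolding when bitext is scarce.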