While end-to-end neural machine translation (NMT) has achieved notable success in recent years in translating a handful of resource-rich language pairs, it still suffers from data scarcity in low-resource language pairs and domains. To tackle this problem, we propose an interactive multimodal framework for zero-resource neural machine translation. Instead of being passively exposed to large amounts of parallel corpora, our learners (implemented as encoder-decoder architectures) engage in cooperative image description games, and thus develop their own image captioning or neural machine translation models from the need to communicate in order to succeed at the game. Experimental results on the IAPR-TC12 and Multi30K datasets show that the proposed learning mechanism significantly improves over state-of-the-art methods.