Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models recently proposed for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector, from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend it by allowing a model to automatically (soft-)search for the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
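The (soft-)search described above can be sketched as a weighted sum over encoder states: at each decoding step, every source position is scored against the current decoder state, the scores are normalized into a probability distribution (the soft alignment), and the context vector is the resulting weighted average of encoder states. The following is a minimal illustrative sketch of this idea; the function name, variable names, and dimensions are assumptions for illustration, not the paper's exact parameterization.

```python
# Illustrative sketch of soft attention over source positions.
# Names (soft_attention, Wa, Ua, va) and dimensions are hypothetical.
import numpy as np

def soft_attention(decoder_state, encoder_states, Wa, Ua, va):
    """decoder_state: (n,), encoder_states: (T, m).
    Returns context vector (m,) and alignment weights (T,)."""
    # Additive scoring: one score per source position t,
    # combining the decoder state with encoder state h_t.
    scores = np.tanh(decoder_state @ Wa + encoder_states @ Ua) @ va  # (T,)
    # Softmax turns scores into a soft alignment distribution.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: expectation of encoder states under the alignment.
    context = weights @ encoder_states  # (m,)
    return context, weights

# Toy usage with random parameters.
rng = np.random.default_rng(0)
n, m, T, d = 4, 5, 6, 3          # decoder dim, encoder dim, source length, score dim
s = rng.normal(size=n)           # current decoder hidden state
H = rng.normal(size=(T, m))      # encoder states, one per source position
ctx, w = soft_attention(s, H,
                        rng.normal(size=(n, d)),
                        rng.normal(size=(m, d)),
                        rng.normal(size=d))
```

Because the weights are differentiable in all parameters, the whole encoder-decoder, including the alignment, can be trained jointly by backpropagation, which is the "jointly tuned" property the abstract refers to.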