In neural machine translation, a source sequence of words is encoded into a vector from which a target sequence is generated in the decoding phase. Unlike in statistical machine translation, the associations between source words and their possible target counterparts are not explicitly stored. Source and target words sit at the two ends of a long information-processing procedure, mediated by hidden states at both the source encoding and the target decoding phases. This makes it possible for a source word to be incorrectly translated into a target word that is not among its admissible equivalents in the target language. In this paper, we seek to shorten somewhat the distance between source and target words in that procedure, and thus strengthen their association, by means of a method we term bridging source and target word embeddings. We experiment with three strategies: (1) a source-side bridging model, where source word embeddings are moved one step closer to the output target sequence; (2) a target-side bridging model, which exploits the source word embeddings most relevant to the prediction of the target sequence; and (3) a direct bridging model, which directly connects source and target word embeddings, seeking to minimize errors in translating one into the other. Experiments and analysis presented in this paper demonstrate that the proposed bridging models significantly improve the quality of both sentence translation in general and the alignment and translation of individual source words with target words in particular.
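To make the direct bridging idea concrete, the following is a minimal sketch, not the paper's exact formulation: it reuses the decoder's attention weights to pool source word embeddings (rather than only encoder hidden states) and penalizes the distance between a transformed embedding of the emitted target word and that pooled source embedding. The function name, tensor shapes, and the transform W are illustrative assumptions, not details given in the abstract.

```python
import torch
import torch.nn.functional as F

def direct_bridging_penalty(attn_weights, src_embeds, tgt_embed, W):
    """Hypothetical auxiliary loss for one decoding step.

    attn_weights: (batch, src_len)    attention over source positions
    src_embeds:   (batch, src_len, d) source word embeddings
    tgt_embed:    (batch, d)          embedding of the emitted target word
    W:            (d, d)              learned transform bridging the two spaces
    Returns a scalar added to the usual cross-entropy loss.
    """
    # Attention-weighted sum of source word embeddings: a "bridged" source context.
    bridged_src = torch.bmm(attn_weights.unsqueeze(1), src_embeds).squeeze(1)  # (batch, d)
    # Encourage the transformed target embedding to stay close to that context.
    return F.mse_loss(tgt_embed @ W, bridged_src)

# Toy usage with random tensors standing in for one decoding step.
batch, src_len, d = 2, 5, 8
attn = torch.softmax(torch.randn(batch, src_len), dim=-1)
src_emb = torch.randn(batch, src_len, d)
tgt_emb = torch.randn(batch, d)
W = torch.randn(d, d, requires_grad=True)
print(direct_bridging_penalty(attn, src_emb, tgt_emb, W).item())
```

Under this reading, the penalty ties each predicted target word to the source words the attention actually focused on, which is one way to strengthen source-target word associations as described above.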