The process of translation is ambiguous, in that there are typically many valid trans- lations for a given sentence. This gives rise to significant variation in parallel cor- pora, however, most current models of machine translation do not account for this variation, instead treating the prob- lem as a deterministic process. To this end, we present a deep generative model of machine translation which incorporates a chain of latent variables, in order to ac- count for local lexical and syntactic varia- tion in parallel corpora. We provide an in- depth analysis of the pitfalls encountered in variational inference for training deep generative models. Experiments on sev- eral different language pairs demonstrate that the model consistently improves over strong baselines.
翻译:翻译过程含糊不清,因为对某一句子通常有许多有效的转接,这在平行的cor-pora之间造成了巨大的差异,然而,目前大多数机器翻译模式并不考虑这种差异,而是将Prob-lem作为确定性过程。为此,我们提出了一个包含一系列潜在变量的深层次机器翻译基因化模型,以便在平行的Corsora对本地的词汇和合成变异进行计算。我们对培训深层基因化模型的变异推论中遇到的陷阱进行了深入的分析。对 sev-neal不同语言配对的实验表明,该模型在强大的基线上不断改进。