Lexically constrained machine translation allows the user to manipulate the output sentence by enforcing the presence or absence of certain words and phrases. Although current approaches can enforce terms to appear in the translation, they often struggle to make the constraint word form agree with the rest of the generated output. Our manual analysis shows that 46% of the errors in the output of a baseline constrained model for English-to-Czech translation are related to agreement. We investigate mechanisms that allow neural machine translation to infer the correct word inflection given lemmatized constraints. In particular, we focus on methods based on training the model with constraints provided as part of the input sequence. Our experiments on the English-Czech language pair show that this approach improves the translation of constrained terms in both automatic and manual evaluation by reducing errors in agreement. Our approach thus eliminates inflection errors without introducing new errors or decreasing the overall quality of the translation.
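
To make the idea of providing constraints as part of the input sequence concrete, the following is a minimal sketch of how a source sentence and its lemmatized target-side constraints might be joined into one training input. The marker tokens `<sep>` and `<c>` and the function name `build_constrained_input` are illustrative assumptions, not the format used in the paper.

```python
from typing import Sequence

# Hypothetical marker tokens; the actual markers used by the authors are not
# given in the abstract.
SEP = "<sep>"  # separates the source sentence from the constraint list
CON = "<c>"    # separates individual constraints from each other


def build_constrained_input(source: str, constraint_lemmas: Sequence[str]) -> str:
    """Append lemmatized target-side constraints to the source sentence.

    The model sees only lemmas, so during training it has to learn to
    produce the surface form that agrees with the rest of the output.
    """
    if not constraint_lemmas:
        # Without constraints the input is the plain source sentence.
        return source
    suffix = f" {CON} ".join(constraint_lemmas)
    return f"{source} {SEP} {suffix}"


# English source with two Czech lemma constraints.
example = build_constrained_input("She lives in the old castle .", ["starý", "hrad"])
print(example)
```

In this example the constraints are given as the dictionary forms "starý" and "hrad", while a fluent Czech translation such as "Bydlí ve starém hradě." needs the locative forms "starém hradě". A model trained on inputs of this shape would be expected to inflect the lemmas itself rather than copy them verbatim.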