The Neural Machine Translation (NMT) model is essentially a joint language model conditioned on both the source sentence and the partial translation. Therefore, the NMT model naturally involves the mechanism of the Language Model (LM), which predicts the next token based only on the partial translation. Despite its success, NMT still suffers from the hallucination problem, generating fluent but inadequate translations. The main reason is that NMT pays excessive attention to the partial translation while neglecting the source sentence to some extent, i.e., overconfidence of the LM. Accordingly, we define the Margin between the NMT model and the LM, calculated by subtracting the predicted probability of the LM from that of the NMT model for each token. The Margin is negatively correlated with the degree of overconfidence of the LM. Based on this property, we propose a Margin-based Token-level Objective (MTO) and a Margin-based Sentence-level Objective (MSO) to maximize the Margin and thereby prevent the LM from becoming overconfident. Experiments on the WMT14 English-to-German, WMT19 Chinese-to-English, and WMT14 English-to-French translation tasks demonstrate the effectiveness of our approach, with 1.36, 1.50, and 0.63 BLEU improvements, respectively, over the Transformer baseline. Human evaluation further verifies that our approaches improve translation adequacy as well as fluency.
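The token-level Margin defined above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the names `p_nmt` and `p_lm` are assumed inputs holding the per-token probabilities that the NMT model and the LM assign to each gold target token.

```python
def token_margins(p_nmt, p_lm):
    """Margin for each target token: NMT probability minus LM probability.

    A large margin means the NMT model draws on the source sentence beyond
    what the LM alone predicts; a small or negative margin signals
    overconfidence of the LM (the token is predicted mostly from the
    partial translation, risking hallucination).
    """
    return [pn - pl for pn, pl in zip(p_nmt, p_lm)]


# Illustrative probabilities for a three-token target sentence
# (values are made up for the example).
p_nmt = [0.70, 0.55, 0.40]
p_lm = [0.30, 0.50, 0.45]
margins = token_margins(p_nmt, p_lm)
```

In this sketch the third token has a negative margin, marking it as a position where the LM dominates; the proposed objectives would push such margins upward during training.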