Recent work in neural machine translation has demonstrated both the necessity and feasibility of using inter-sentential context -- context from sentences other than those currently being translated. However, while many current methods present model architectures that can, in theory, use this extra context, it is often unclear how much they actually use it at translation time. In this paper, we introduce a new metric, conditional cross-mutual information (CXMI), to quantify the usage of context by these models. Using this metric, we measure how much document-level machine translation systems use particular varieties of context. We find that target context is referenced more than source context, and that conditioning on longer contexts yields diminishing returns. We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models. Experiments show that our method increases context usage, and that this is reflected in improved translation quality as measured by metrics such as BLEU and COMET, as well as in performance on contrastive test sets for anaphoric pronoun resolution and lexical cohesion.
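To make the metric concrete, here is a minimal sketch of how such a quantity can be estimated. Assuming a context-agnostic model $q_A$ and a context-aware model $q_C$ (illustrative notation, not necessarily the paper's), the conditional cross-mutual information of context $C$ on the target $Y$ given the source $X$ is the difference between the two models' cross-entropies, estimated over $N$ held-out sentence pairs:

\[
\mathrm{CXMI}(C \rightarrow Y \mid X) \;=\; H_{q_A}(Y \mid X) - H_{q_C}(Y \mid X, C) \;\approx\; \frac{1}{N} \sum_{i=1}^{N} \log \frac{q_C\!\left(y^{(i)} \mid x^{(i)}, C^{(i)}\right)}{q_A\!\left(y^{(i)} \mid x^{(i)}\right)}
\]

A positive value indicates that the context-aware model assigns higher likelihood to the reference translations once context is provided, i.e., that the model is actually exploiting the context rather than merely being able to attend to it.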
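The following is a minimal sketch of one plausible implementation of context-aware word dropout, under the assumption that tokens of the current source sentence are replaced by a mask symbol at some rate while the context sentences are left intact; the function name, the `<mask>`/`<sep>` symbols, and the dropout rate are illustrative choices, not taken from the paper:

```python
import random

def coword_dropout(tokens, p_drop=0.1, mask="<mask>"):
    """Replace each token of the CURRENT sentence with a mask symbol
    with probability p_drop, pushing the model to recover the missing
    information from the surrounding context sentences."""
    return [mask if random.random() < p_drop else t for t in tokens]

# Usage: only the current source sentence is corrupted; the concatenated
# context sentences are passed through unchanged.
context = "the cat sat down .".split()
current = "it was very tired .".split()
model_input = context + ["<sep>"] + coword_dropout(current, p_drop=0.2)
```

Because the corrupted tokens can often only be reconstructed from the neighboring sentences, this simple training-time perturbation directly rewards context usage, which is what the CXMI metric above then measures.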