Recent work in neural machine translation has demonstrated both the necessity and feasibility of using inter-sentential context -- context from sentences other than those currently being translated. However, while many current methods present model architectures that can, in theory, use this extra context, it is often unclear how much they actually use it at translation time. In this paper, we introduce a new metric, conditional cross-mutual information (CXMI), to quantify the usage of context by these models. Using this metric, we measure how much document-level machine translation systems use particular varieties of context. We find that target context is referenced more than source context, and that conditioning on longer contexts yields diminishing returns. We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models. Experiments show that our method increases context usage, and that this is reflected in improved translation quality as measured by metrics such as BLEU and COMET, as well as in performance on contrastive test sets for anaphoric pronoun resolution and lexical cohesion.
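To make the metric concrete, here is a minimal sketch of how such a quantity can be estimated. Assuming a context-agnostic model $q_A$ and a context-aware model $q_C$ (illustrative notation, not necessarily the paper's), the conditional cross-mutual information of context $C$ on the target $Y$ given the source $X$ is the difference between the two models' cross-entropies, estimated over $N$ held-out sentence pairs:

\[
\mathrm{CXMI}(C \rightarrow Y \mid X) \;=\; H_{q_A}(Y \mid X) - H_{q_C}(Y \mid X, C) \;\approx\; \frac{1}{N} \sum_{i=1}^{N} \log \frac{q_C\!\left(y^{(i)} \mid x^{(i)}, C^{(i)}\right)}{q_A\!\left(y^{(i)} \mid x^{(i)}\right)}
\]

A positive value indicates that the context-aware model assigns higher likelihood to the reference translations once context is provided, i.e., that the model is actually exploiting the context rather than merely being able to attend to it.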
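The following is a minimal sketch of one plausible implementation of context-aware word dropout, under the assumption that tokens of the current source sentence are replaced by a mask symbol at some rate while the context sentences are left intact; the function name, the `<mask>`/`<sep>` symbols, and the dropout rate are illustrative choices, not taken from the paper:

```python
import random

def coword_dropout(tokens, p_drop=0.1, mask="<mask>"):
    """Replace each token of the CURRENT sentence with a mask symbol
    with probability p_drop, pushing the model to recover the missing
    information from the surrounding context sentences."""
    return [mask if random.random() < p_drop else t for t in tokens]

# Usage: only the current source sentence is corrupted; the concatenated
# context sentences are passed through unchanged.
context = "the cat sat down .".split()
current = "it was very tired .".split()
model_input = context + ["<sep>"] + coword_dropout(current, p_drop=0.2)
```

Because the corrupted tokens can often only be reconstructed from the neighboring sentences, this simple training-time perturbation directly rewards context usage, which is what the CXMI metric above then measures.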