Trainable evaluation metrics for machine translation (MT) exhibit strong correlation with human judgements, but they are often hard to interpret and may produce unreliable scores under noisy or out-of-domain data. Recent work has attempted to mitigate this with simple uncertainty quantification techniques (Monte Carlo dropout and deep ensembles); however, as we show, these techniques are limited in several ways: they are unable to distinguish between different kinds of uncertainty, and they are costly in both time and memory. In this paper, we propose more powerful and efficient uncertainty predictors for MT evaluation, and we assess their ability to target different sources of aleatoric and epistemic uncertainty. To this end, we develop and compare training objectives that enhance the COMET metric with an uncertainty prediction output, including heteroscedastic regression, divergence minimization, and direct uncertainty prediction. Our experiments show improved results on uncertainty prediction for the WMT metrics task datasets, with a substantial reduction in computational costs. Moreover, they demonstrate the ability of these predictors to address specific causes of uncertainty in MT evaluation, such as low-quality references and out-of-domain data.
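As an illustrative sketch of the first of these objectives (a standard formulation of heteroscedastic regression, not necessarily the exact variant used in this work), the metric is extended to predict both a quality score \(\hat{\mu}(x)\) and an input-dependent variance \(\hat{\sigma}^2(x)\), trained against the human score \(y\) with the Gaussian negative log-likelihood:
\[
\mathcal{L}(x, y) \;=\; \frac{\bigl(y - \hat{\mu}(x)\bigr)^2}{2\,\hat{\sigma}^2(x)} \;+\; \frac{1}{2}\log \hat{\sigma}^2(x),
\]
so that \(\hat{\sigma}^2(x)\) can be read off directly as an (aleatoric) uncertainty estimate, without the repeated forward passes required by Monte Carlo dropout or deep ensembles.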