Obtaining meaningful quality scores for machine translation systems through human evaluation remains a challenge, given the high variability among human evaluators, which arises partly from subjective expectations of translation quality across different language pairs. We propose XSTS, a new metric focused on semantic equivalence, along with a cross-lingual calibration method that enables more consistent assessment. We demonstrate the effectiveness of these novel contributions in large-scale evaluation studies covering up to 14 language pairs, with translation both into and out of English.