BERTScore is an effective and robust automatic metric for reference-based machine translation evaluation. In this paper, we incorporate a multilingual knowledge graph into BERTScore and propose a metric named KG-BERTScore, which linearly combines the results of BERTScore and bilingual named entity matching for reference-free machine translation evaluation. In experiments on the WMT19 "QE as a Metric" (without references) shared task, KG-BERTScore achieves a higher overall correlation with human judgements than the current state-of-the-art metrics for reference-free machine translation evaluation. We also study the pre-trained multilingual model used by KG-BERTScore and the parameter of the linear combination.
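As a rough illustration of the linear combination described above, the following Python sketch mixes a BERTScore value with an entity-matching value. The abstract does not give the exact formula, so everything here is an assumption: the convex-combination form, the weight name `alpha`, the function names, and the definition of entity matching as the fraction of source entities whose knowledge-graph identifier also appears in the translation.

```python
from typing import Iterable


def entity_match_score(source_ids: Iterable[str], translation_ids: Iterable[str]) -> float:
    """Hypothetical entity-matching score.

    Assumes named entities on both sides were already linked to
    language-independent knowledge-graph identifiers. Returns the fraction
    of distinct source entities that also appear in the translation.
    """
    src = set(source_ids)
    if not src:
        # Assumption: with no source entities there is nothing to penalise.
        return 1.0
    return len(src & set(translation_ids)) / len(src)


def combine_scores(bert_score: float, entity_score: float, alpha: float = 0.5) -> float:
    """Assumed convex combination of the two component scores.

    alpha is the weight on the entity-matching score and must lie in [0, 1];
    the default of 0.5 is a placeholder, not a value from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * entity_score + (1.0 - alpha) * bert_score


# Example with made-up numbers: one of two source entities is preserved.
kg = entity_match_score(["Q64", "Q183"], ["Q64"])
score = combine_scores(bert_score=0.8, entity_score=kg, alpha=0.25)
```

With `alpha = 0` this sketch reduces to plain BERTScore, and with `alpha = 1` it relies on entity matching alone; the parameter studied in the paper plays the role of this trade-off.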