As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies. In many scenarios, and particularly in cases of domain adaptation, one expects the MT output to adhere to the constraints provided by a terminology. In this work, we propose metrics to measure the consistency of MT output with regard to a domain terminology. We perform studies on the COVID-19 domain over five languages and also carry out terminology-targeted human evaluation. We open-source the code for computing all proposed metrics: https://github.com/mahfuzibnalam/terminology_evaluation