This report presents an automatic evaluation of the general machine translation task of the Seventh Conference on Machine Translation (WMT22). It covers a total of 185 systems across 21 translation directions, ranging from high-resource to low-resource language pairs and from closely related to distant languages. This large-scale automatic evaluation highlights some of the current limits of state-of-the-art machine translation systems. It also shows how automatic metrics, namely chrF, BLEU, and COMET, can complement one another to mitigate their respective limitations in interpretability and accuracy.
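To make the metric discussion concrete, the sketch below implements a simplified version of the character n-gram F-score idea behind chrF. This is not the official implementation (in practice one would use the sacrebleu library, and COMET additionally requires a pretrained neural model); the function name, the whitespace stripping, and the uniform averaging over n-gram orders are simplifying assumptions for illustration only.

```python
from collections import Counter

def char_ngrams(text, n):
    # Character n-grams over the whitespace-stripped string (simplification).
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF-style score: mean character n-gram F-beta
    over n = 1..max_n, scaled to 0-100. beta > 1 weights recall
    more heavily than precision, as in chrF."""
    scores = []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue
        # Clipped n-gram overlap between hypothesis and reference.
        overlap = sum((hyp & ref).values())
        prec = overlap / sum(hyp.values())
        rec = overlap / sum(ref.values())
        if prec + rec == 0:
            scores.append(0.0)
            continue
        f = (1 + beta**2) * prec * rec / (beta**2 * prec + rec)
        scores.append(f)
    return 100 * sum(scores) / len(scores) if scores else 0.0

# A perfect match scores 100; a partial match scores strictly lower.
print(simple_chrf("the cat sat", "the cat sat"))  # 100.0
print(simple_chrf("the dog sat", "the cat sat") < 100.0)
```

Because chrF operates on character n-grams rather than the word n-grams of BLEU, it is more forgiving of morphological variation, which is one reason the report uses the metrics as complements rather than relying on any single one.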