Most studies on word-level Quality Estimation (QE) of machine translation focus on language-specific models. The obvious disadvantages of these approaches are the need for labelled data for each language pair and the high cost required to maintain several language-specific models. To overcome these problems, we explore different approaches to multilingual, word-level QE. We show that these QE models perform on par with the current language-specific models. In the cases of zero-shot and few-shot QE, we demonstrate that it is possible to accurately predict word-level quality for any given new language pair from models trained on other language pairs. Our findings suggest that the word-level QE models based on powerful pre-trained transformers that we propose in this paper generalise well across languages, making them more useful in real-world scenarios.