In Machine Translation, assessing the quality of a large number of automatic translations can be challenging. Automatic metrics are not reliable when it comes to high-performing systems. In addition, resorting to human evaluators can be expensive, especially when evaluating multiple systems. To overcome the latter challenge, we propose a novel application of online learning that, given an ensemble of Machine Translation systems, dynamically converges to the best systems by taking advantage of the human feedback available. Our experiments on WMT'19 datasets show that our online approach quickly converges to the top-3 ranked systems for the language pairs considered, despite the lack of human feedback for many translations.
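To make the kind of online learning described above concrete, the sketch below shows a standard EXP3-style bandit loop over an ensemble of MT systems that updates its selection distribution only on the rounds where human feedback happens to be available. This is a minimal illustration under assumed interfaces, not the paper's exact algorithm; the function names `get_translation` and `get_human_feedback`, and the parameters `gamma` and `num_rounds`, are hypothetical placeholders.

```python
import math
import random

def exp3_mt_selection(systems, get_translation, get_human_feedback,
                      num_rounds=1000, gamma=0.1):
    """EXP3-style online selection among an ensemble of MT systems.

    `systems` is a list of system identifiers; `get_translation(sys, i)`
    returns system `sys`'s translation of source sentence `i`, and
    `get_human_feedback(translation)` returns a reward in [0, 1] or None
    when no human judgement is available (feedback is sparse).
    """
    K = len(systems)
    weights = [1.0] * K

    for t in range(num_rounds):
        # Mix the exponential weights with uniform exploration.
        total = sum(weights)
        probs = [(1 - gamma) * w / total + gamma / K for w in weights]

        # Sample one system and show its translation to the evaluator.
        k = random.choices(range(K), weights=probs)[0]
        translation = get_translation(systems[k], t)
        reward = get_human_feedback(translation)

        # Update only when human feedback exists; unannotated rounds are skipped.
        if reward is not None:
            estimated = reward / probs[k]  # importance-weighted reward estimate
            weights[k] *= math.exp(gamma * estimated / K)

    return probs  # final selection distribution over the systems
```

Under this kind of scheme, the selection distribution concentrates on the systems that keep receiving high human rewards, which is one plausible way to obtain the convergence to the top-ranked systems reported in the abstract.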