In this report, we present our submission to the WMT 2022 Metrics Shared Task. We build our system on the core idea of UNITE (Unified Translation Evaluation), which unifies source-only, reference-only, and source-reference-combined evaluation scenarios into a single model. Specifically, during the pre-training phase, we first use pseudo-labeled data examples to continually pre-train UNITE. Notably, to reduce the gap between pre-training and fine-tuning, we apply data cropping and a ranking-based score normalization strategy. During the fine-tuning phase, we use both Direct Assessment (DA) and Multidimensional Quality Metrics (MQM) data from past years' WMT competitions. Finally, we collect the results from models with different pre-trained language model backbones, and apply different ensembling strategies to the involved translation directions.
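The ranking-based score normalization mentioned above can be illustrated with a minimal sketch. The report does not spell out the exact transform, so the function below is an assumption: it maps raw metric scores to [0, 1] by rank order, which makes scores from heterogeneous pseudo-labelers comparable before they are used for continual pre-training.

```python
def rank_normalize(scores):
    """Map raw scores to [0, 1] according to their rank order.

    This is a hypothetical illustration of rank-based normalization,
    not the exact procedure used in the submission.
    """
    n = len(scores)
    if n == 1:
        return [0.5]
    # Indices sorted by ascending raw score.
    order = sorted(range(n), key=lambda i: scores[i])
    normalized = [0.0] * n
    for rank, idx in enumerate(order):
        normalized[idx] = rank / (n - 1)
    return normalized
```

For example, raw scores of 0.2, 0.9, and 0.5 would be mapped to 0.0, 1.0, and 0.5 respectively, preserving only their relative ordering.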