Being able to rank the similarity of short text segments is an interesting bonus feature of neural machine translation. Translation-based similarity measures include direct and pivot translation probability, as well as translation cross-likelihood, which has not been studied so far. We analyze these measures in the common framework of multilingual NMT, releasing the NMTScore library (available at https://github.com/ZurichNLP/nmtscore). Compared to baselines such as sentence embeddings, translation-based measures prove competitive in paraphrase identification and are more robust against adversarial or multilingual input, especially if proper normalization is applied. When used for reference-based evaluation of data-to-text generation in two tasks and 17 languages, translation-based measures show a relatively high correlation with human judgments.
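To make the idea of a normalized direct translation probability concrete, the following is a minimal sketch rather than the NMTScore implementation. It assumes that a sentence probability is the geometric mean of token probabilities, that normalization divides by the probability of a sentence given itself, and that the two directions are averaged. The function `toy_token_logprobs` is a hypothetical stand-in based on word overlap, not a real NMT model, and all names are illustrative.

```python
import math
from typing import Callable, List

# Hypothetical interface: one log-probability per target token
# for producing `target` given `source`.
TokenLogProbs = Callable[[str, str], List[float]]


def toy_token_logprobs(source: str, target: str) -> List[float]:
    """Stand-in for an NMT model (NOT a real translation model):
    target tokens that also occur in the source get a high probability."""
    source_tokens = set(source.lower().split())
    return [
        math.log(0.9) if token in source_tokens else math.log(0.1)
        for token in target.lower().split()
    ]


def length_normalized_prob(logprobs: List[float]) -> float:
    """Geometric mean of the token probabilities."""
    if not logprobs:
        raise ValueError("target must contain at least one token")
    return math.exp(sum(logprobs) / len(logprobs))


def direct_score(
    a: str,
    b: str,
    model: TokenLogProbs = toy_token_logprobs,
    normalize: bool = True,
    both_directions: bool = True,
) -> float:
    """Similarity of a and b as the probability of one given the other."""

    def one_way(source: str, target: str) -> float:
        prob = length_normalized_prob(model(source, target))
        if normalize:
            # Assumed normalization: divide by the probability of
            # reproducing the target from itself.
            prob /= length_normalized_prob(model(target, target))
        return prob

    if both_directions:
        # Average both directions to obtain a symmetric measure.
        return 0.5 * (one_way(a, b) + one_way(b, a))
    return one_way(b, a)


if __name__ == "__main__":
    print(direct_score("the cat sleeps", "the cat naps"))
    print(direct_score("the cat sleeps", "a dog barks"))
```

Under these assumptions, identical segments score 1 after normalization, which is one way to make scores comparable across segments of different length and difficulty. Replacing the toy function with token log-probabilities from a multilingual NMT model would turn the sketch into an actual translation-based measure.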