Speech-to-speech translation combines machine translation with speech synthesis, introducing evaluation challenges not present in either task alone. How to automatically evaluate speech-to-speech translation is an open question that has not previously been explored. Translating to speech rather than to text is often motivated by unwritten languages or languages without standardized orthographies. However, we show that the automatic metric previously used for this task is well suited only to standardized high-resource languages. In this work, we first evaluate current metrics for speech-to-speech translation, and then assess how translating to dialectal variants rather than to standardized languages affects various evaluation methods.