Song translation requires both translating the lyrics and aligning them with the music notes so that the resulting verse can be sung to the accompanying melody. This is a challenging problem that has attracted some interest, with work addressing different aspects of the translation process. In this paper, we propose Lyrics-Melody Translation with Adaptive Grouping (LTAG), a holistic solution to automatic song translation that jointly models lyrics translation and lyrics-melody alignment. LTAG is a novel encoder-decoder framework that simultaneously translates the source lyrics and, through an adaptive note grouping module, determines the number of notes aligned to the output at each decoding step. To address data scarcity, we commissioned a small amount of training data annotated specifically for this task and supplemented it with a large amount of augmented data generated through back-translation. Experiments on an English-Chinese song translation dataset show the effectiveness of our model in both automatic and human evaluation.
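To make the alignment output concrete, the following minimal sketch shows what "the number of aligned notes at each decoding step" amounts to as a data structure. It is not the LTAG model: the function name `group_notes`, the MIDI pitch representation, and the example group sizes are assumptions introduced here for illustration, and in the actual system the group sizes would be predicted by the adaptive note grouping module rather than supplied by hand.

```python
from typing import List, Sequence, Tuple


def group_notes(
    tokens: Sequence[str],
    notes: Sequence[int],
    group_sizes: Sequence[int],
) -> List[Tuple[str, List[int]]]:
    """Pair each target token with the next `group_sizes[i]` notes, left to right.

    Hypothetical helper: `tokens` are decoded target-language units,
    `notes` are melody notes (here MIDI pitch numbers), and `group_sizes`
    holds one note count per decoding step.
    """
    if len(tokens) != len(group_sizes):
        raise ValueError("one group size is needed per decoded token")
    if any(size < 0 for size in group_sizes):
        raise ValueError("group sizes must be non-negative")
    if sum(group_sizes) != len(notes):
        raise ValueError("group sizes must consume the melody exactly")

    aligned: List[Tuple[str, List[int]]] = []
    cursor = 0  # index of the first note not yet assigned to a token
    for token, size in zip(tokens, group_sizes):
        aligned.append((token, list(notes[cursor:cursor + size])))
        cursor += size
    return aligned


# Illustrative input: three Chinese characters sung over four notes,
# with the second character held across two notes.
example = group_notes(["我", "爱", "你"], [60, 62, 64, 65], [1, 2, 1])
```

Representing the alignment as one note count per decoded token keeps the melody order fixed and lets a single token span several notes, which is the property a singable translation needs.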