Chinese dialects are distinct varieties of Chinese and can be regarded as separate languages within the same language family as Mandarin. Although they all use Chinese characters, their pronunciation, grammar, and idioms can differ significantly, and even native speakers may find it hard to type the correct written form of a dialect. Moreover, feeding Mandarin text directly into a dialect text-to-speech (TTS) system yields speech with poor naturalness. In this paper, we propose a novel Chinese dialect TTS frontend with a translation module that converts Mandarin text into dialectal expressions, improving the intelligibility and naturalness of the synthesized speech. For the translation task, we propose a non-autoregressive neural machine translation model with several refinements. To our knowledge, this is the first work to incorporate translation into a TTS frontend. Experiments on Cantonese show that the proposed translation model improves BLEU by 2.56 and that, with Mandarin inputs, the TTS system improves MOS by 0.27.