This paper presents a method for end-to-end cross-lingual text-to-speech (TTS) that aims to preserve the target language's pronunciation regardless of the original speaker's language. The model is based on a non-attentive Tacotron architecture in which the decoder is replaced by a normalizing flow network conditioned on the speaker identity; because this inherently disentangles linguistic content from speaker identity, the same model can perform both TTS and voice conversion (VC). In the cross-lingual setting, acoustic features are first produced with a native speaker of the target language, and the same model then applies voice conversion to render these features in the target speaker's voice. Through objective and subjective evaluations, we verify that our method offers benefits over baseline cross-lingual synthesis. By including speakers with an average of 7.5 minutes of speech each, we also present positive results in low-resource scenarios.
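To make the two-stage inference concrete, the sketch below illustrates the idea in PyTorch-style Python: a normalizing flow conditioned on a speaker embedding, whose forward pass maps acoustic features to a speaker-independent latent and whose inverse renders that latent for a different speaker. This is a minimal sketch under simplifying assumptions (plain affine couplings, no text conditioning), not the paper's implementation; all names (`SpeakerConditionedFlow`, `to_latent`, `to_mel`) are hypothetical.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One affine coupling layer conditioned on a speaker embedding."""
    def __init__(self, dim, spk_dim, hidden=256):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half + spk_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x, spk, inverse=False):
        xa, xb = x[..., :self.half], x[..., self.half:]
        scale, shift = self.net(torch.cat([xa, spk], dim=-1)).chunk(2, dim=-1)
        scale = torch.tanh(scale)  # keep the transform well-conditioned
        if inverse:
            xb = (xb - shift) * torch.exp(-scale)
        else:
            xb = xb * torch.exp(scale) + shift
        return torch.cat([xa, xb], dim=-1)

class SpeakerConditionedFlow(nn.Module):
    """Stack of couplings: the forward pass strips speaker identity into a
    latent; the inverse pass re-renders the latent for a given speaker."""
    def __init__(self, dim=80, spk_dim=64, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            AffineCoupling(dim, spk_dim) for _ in range(n_layers))

    def to_latent(self, mel, spk):
        """Acoustic features -> (nominally) speaker-free latent."""
        z = mel
        for layer in self.layers:
            # flip feature halves between layers so every dim gets transformed
            z = torch.flip(layer(z, spk), dims=[-1])
        return z

    def to_mel(self, z, spk):
        """Inverse pass: latent -> acoustic features for speaker `spk`."""
        for layer in reversed(self.layers):
            z = layer(torch.flip(z, dims=[-1]), spk, inverse=True)
        return z

# Two-stage cross-lingual inference, as described above:
# 1) synthesize features with a native speaker of the target language,
# 2) convert them to the target speaker's voice with the same flow.
flow = SpeakerConditionedFlow()
mel_native = torch.randn(1, 50, 80)   # stand-in for the TTS output of step 1
spk_native = torch.randn(1, 50, 64)   # native-speaker embedding, per frame
spk_target = torch.randn(1, 50, 64)   # target-speaker embedding, per frame
z = flow.to_latent(mel_native, spk_native)   # strip the native speaker
mel_target = flow.to_mel(z, spk_target)      # re-render in the target voice
```

Because both directions use the same invertible network, TTS and VC share all parameters; swapping the speaker embedding between the forward and inverse passes is what performs the conversion.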