StarGANV2-VC: 自然声音转换的多样化、不受监督、无平行框架 (StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion)

We present an unsupervised non-parallel many-to-many voice conversion (VC) method using a generative adversarial network (GAN) called StarGAN v2. Using a combination of adversarial source classifier loss and perceptual loss, our model significantly outperforms previous VC models. Although our model is trained only with 20 English speakers, it generalizes to a variety of voice conversion tasks, such as any-to-many, cross-lingual, and singing conversion. Using a style encoder, our framework can also convert plain reading speech into stylistic speech, such as emotional and falsetto speech. Subjective and objective evaluation experiments on a non-parallel many-to-many voice conversion task revealed that our model produces natural sounding voices, close to the sound quality of state-of-the-art text-to-speech (TTS) based voice conversion methods without the need for text labels. Moreover, our model is completely convolutional and with a faster-than-real-time vocoder such as Parallel WaveGAN can perform real-time voice conversion.

翻译：我们使用称为StarGAN v2的基因对抗网络(GAN), 使用对抗源分类器损失和感官损失的组合,我们的模式大大优于先前的VC模式。虽然我们的模型只受过20个英语语言的培训,但被概括为多种声音转换任务,如任何到many、跨语言和歌唱转换。我们的框架还可以将普通读音转换成文体语言,如情感和假话。对非平行源分类器损失和感官损失的主观和客观评估实验显示,我们的模型产生自然声音,接近于以最先进的文本到语音(TTTS)为基础的声音转换方法的音质,而不需要文字标签。此外,我们的模型是完全革命性的,而且具有比实时更快的语音转换,例如平行的WaveGAN能够进行实时语音转换。

相关内容

MoDELS

关注 43

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

最新《生成式对抗网络》简介，25页ppt

专知会员服务

175+阅读 · 2020年6月28日

【伯克利】最新《生成式对抗网络》技术综述课程，257页ppt带你学习GAN进展

专知会员服务

193+阅读 · 2020年5月3日

【google】监督对比学习，Supervised Contrastive Learning

专知会员服务

32+阅读 · 2020年4月23日