This paper presents an expressive speech synthesis architecture for modeling and controlling speaking style at the word level. It learns word-level stylistic and prosodic representations of the speech data with the aid of two encoders. The first models style by finding a combination of style tokens for each word given the acoustic features, and the second outputs a word-level sequence conditioned only on the phonetic information in order to disentangle it from the style information. The two encoder outputs are aligned and concatenated with the phoneme encoder outputs and then decoded with a Non-Attentive Tacotron model. An additional prior encoder predicts the style tokens autoregressively, so that the model can run without a reference utterance. We find that the resulting model provides both word-level and global control over style, as well as prosody transfer capabilities.
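The word-level style encoder described above selects a combination of learnable style tokens for each word via attention over the token bank. A minimal NumPy sketch of that token-attention step is shown below; the token count, feature dimension, and function names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def style_token_attention(word_features, style_tokens):
    """For each word-level acoustic summary, compute attention weights
    over a bank of learnable style tokens and return the weighted
    combination (the word's style embedding).

    word_features: (num_words, dim) acoustic summaries, one per word
    style_tokens:  (num_tokens, dim) learnable token bank
    """
    dim = style_tokens.shape[1]
    # Scaled dot-product scores: (num_words, num_tokens)
    scores = word_features @ style_tokens.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)
    # Each word's style is a convex combination of the tokens.
    word_styles = weights @ style_tokens
    return word_styles, weights

# Hypothetical sizes: 10 style tokens, 64-dim features, 7 words.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(10, 64))
words = rng.normal(size=(7, 64))
styles, weights = style_token_attention(words, tokens)
```

At inference time without a reference utterance, the prior encoder would replace the acoustic `word_features` by predicting the token weights autoregressively from the text alone.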