This paper aims to synthesize the target speaker's speech with the desired speaking style and emotion by transferring the style and emotion from reference speech recorded by other speakers. We address this challenging problem with a two-stage framework composed of a text-to-style-and-emotion (Text2SE) module and a style-and-emotion-to-wave (SE2Wave) module, bridged by neural bottleneck (BN) features. To further address the multi-factor (speaker timbre, speaking style, and emotion) decoupling problem, we adopt a multi-label binary vector (MBV) to discretize the extracted embeddings and mutual information (MI) minimization to disentangle these highly entangled factors in both the Text2SE and SE2Wave modules. Moreover, we introduce a semi-supervised training strategy to leverage data from multiple speakers, including emotion-labeled data, style-labeled data, and unlabeled data. To better transfer fine-grained expression from references to the target speaker in non-parallel transfer, we introduce a reference-candidate pool and propose an attention-based reference selection approach. Extensive experiments demonstrate the effectiveness of our model design.
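To make the MBV idea concrete, the sketch below (PyTorch, with hypothetical module and parameter names; the paper's exact implementation may differ) shows one common way to discretize a continuous style/emotion embedding into a multi-label binary code: squash the projection to (0, 1) and binarize it with a straight-through estimator so gradients still flow to the encoder.

```python
# Minimal sketch of a multi-label binary vector (MBV) bottleneck.
# Assumptions: in_dim, code_dim, and the straight-through binarization
# are illustrative choices, not the paper's confirmed implementation.
import torch
import torch.nn as nn


class MultiLabelBinaryVector(nn.Module):
    """Project an embedding to a binary code; hard values in the forward
    pass, soft sigmoid gradients in the backward pass (straight-through)."""

    def __init__(self, in_dim: int, code_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, code_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        soft = torch.sigmoid(self.proj(x))   # soft activations in (0, 1)
        hard = (soft > 0.5).float()          # discrete 0/1 code
        return hard + soft - soft.detach()   # straight-through estimator


# Usage: compress a 256-dim style embedding into a 32-bit binary code.
mbv = MultiLabelBinaryVector(in_dim=256, code_dim=32)
style_code = mbv(torch.randn(4, 256))        # shape (4, 32), values in {0, 1}
```

Restricting the bottleneck to a discrete binary code limits how much speaker-specific information can leak through the style and emotion branches, which is complementary to the MI-minimization objective used for disentanglement.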