使用非帕拉语音转换器进行低资源文字到语音的跨语音情感传输,并增加 Pitch-Shift 数据 (Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation) - 专知论文

会员服务 ·

0

数据增强 · 语音合成 · MoDELS · 相似度 · 学成 ·

2022 年 4 月 21 日

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

翻译：使用非帕拉语音转换器进行低资源文字到语音的跨语音情感传输,并增加 Pitch-Shift 数据

Ryo Terashima,Ryuichi Yamamoto,Eunwoo Song,Yuma Shirahata,Hyun-Wook Yoon,Jae-Min Kim,Kentaro Tachibana

from arxiv, Submitted to Interspeech 2022

Data augmentation via voice conversion (VC) has been successfully applied to low-resource expressive text-to-speech (TTS) when only neutral data for the target speaker are available. Although the quality of VC is crucial for this approach, it is challenging to learn a stable VC model because the amount of data is limited in low-resource scenarios, and highly expressive speech has large acoustic variety. To address this issue, we propose a novel data augmentation method that combines pitch-shifting and VC techniques. Because pitch-shift data augmentation enables the coverage of a variety of pitch dynamics, it greatly stabilizes training for both VC and TTS models, even when only 1,000 utterances of the target speaker's neutral data are available. Subjective test results showed that a FastSpeech 2-based emotional TTS system with the proposed method improved naturalness and emotional similarity compared with conventional methods.

翻译：在只有目标发言者的中性数据的情况下,通过语音转换(VC)增加数据的做法已成功地应用于低资源表达式文本到语音(TTS),只有目标发言者的中性数据。虽然VC的质量对这种方法至关重要,但是要学习稳定的VC模式却具有挑战性,因为低资源情景下的数据数量有限,而且高度表达式的言论具有巨大的声学多样性。为了解决这一问题,我们提议了一种新的数据增强方法,将投放式转换和VC技术结合起来。由于投放式数据增强能够覆盖各种投放动态,它大大稳定了对VC和TTS模型的培训,即使只有1 000个目标发言者的中性数据的发音。主观测试结果表明,快速语音2基情感TTS系统与拟议方法的自然性和与传统方法的情感相似性得到了改善。

0

相关内容

数据增强

数据增强在机器学习领域多指采用一些方法（比如数据蒸馏，正负样本均衡等）来提高模型数据集的质量，增强数据。

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【NLP模型压缩方法综述】《A Survey of Methods for Model Compression in NLP》by Madison May

【NLP模型压缩方法综述】《A Survey of Methods for Model Compression in NLP》by Madison May

专知会员服务

43+阅读 · 2020年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

基于Lowrank分解的谱方法和有限差分地震正演模拟

国家自然科学基金

0+阅读 · 2015年12月31日

罗巴代数的表示和罗巴代数在operad中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

GPC3调控肝细胞癌侵袭和转移的信号通路研究

国家自然科学基金

0+阅读 · 2014年12月31日

缺氧肿瘤微环境在促进CD133+CXCR4-结肠癌细胞上皮间质转化中的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

稀土氧化物修饰三维介孔镍基催化剂对SOFC阳极抗积碳和强度研究

国家自然科学基金

0+阅读 · 2013年12月31日

Haccpper环境中不锈钢表面活性与电化学噪声特征研究

国家自然科学基金

0+阅读 · 2012年12月31日

太阳大气中激波与冕环相互作用的研究

国家自然科学基金

0+阅读 · 2012年12月31日

呼伦贝尔羊草草甸草原CO2、CH4 和N2O 通量对放牧强度的响应机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

CX3CL1/CX3CR1相互作用调控低氧前列腺癌细胞转移的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

Fast Deep Autoencoder for Federated learning

Fast Deep Autoencoder for Federated learning

Arxiv

1+阅读 · 2022年6月10日

Speak Like a Dog: Human to Non-human creature Voice Conversion

Arxiv

0+阅读 · 2022年6月9日

On the Parameter Combinations That Matter and on Those That do Not

Arxiv

0+阅读 · 2022年6月9日

OOD Augmentation May Be at Odds with Open-Set Recognition

Arxiv

0+阅读 · 2022年6月9日

Perceptual Quality Assessment for Fine-Grained Compressed Images

Arxiv

0+阅读 · 2022年6月8日

Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition

Arxiv

0+阅读 · 2022年6月7日

Pretrained Transformers for Text Ranking: BERT and Beyond

Arxiv

28+阅读 · 2020年10月13日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Arxiv

19+阅读 · 2020年6月18日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Arxiv

66+阅读 · 2019年9月8日

Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling

Arxiv

11+阅读 · 2018年6月16日

VIP会员

文章信息

相关主题

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【NLP模型压缩方法综述】《A Survey of Methods for Model Compression in NLP》by Madison May

【NLP模型压缩方法综述】《A Survey of Methods for Model Compression in NLP》by Madison May

专知会员服务

43+阅读 · 2020年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《军事域人工智能风险、机遇与治理战略指导报告》2025最新76页报告

《杀伤网与精确规模：智能饱和战争时代的战略要务-印度视角》2025最新报告

俄乌冲突的地缘政治与军事教训（万字长文）

《弹药快速效能建模：推进互操作性与技术优势》2025最新26页报告

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Fast Deep Autoencoder for Federated learning

Fast Deep Autoencoder for Federated learning

Arxiv

1+阅读 · 2022年6月10日

Speak Like a Dog: Human to Non-human creature Voice Conversion

Arxiv

0+阅读 · 2022年6月9日

On the Parameter Combinations That Matter and on Those That do Not

Arxiv

0+阅读 · 2022年6月9日

OOD Augmentation May Be at Odds with Open-Set Recognition

Arxiv

0+阅读 · 2022年6月9日

Perceptual Quality Assessment for Fine-Grained Compressed Images

Arxiv

0+阅读 · 2022年6月8日

Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition

Arxiv

0+阅读 · 2022年6月7日

Pretrained Transformers for Text Ranking: BERT and Beyond

Arxiv

28+阅读 · 2020年10月13日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Arxiv

19+阅读 · 2020年6月18日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Arxiv

66+阅读 · 2019年9月8日

Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling

Arxiv

11+阅读 · 2018年6月16日

相关基金

基于Lowrank分解的谱方法和有限差分地震正演模拟

国家自然科学基金

0+阅读 · 2015年12月31日

罗巴代数的表示和罗巴代数在operad中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

GPC3调控肝细胞癌侵袭和转移的信号通路研究

国家自然科学基金

0+阅读 · 2014年12月31日

缺氧肿瘤微环境在促进CD133+CXCR4-结肠癌细胞上皮间质转化中的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

稀土氧化物修饰三维介孔镍基催化剂对SOFC阳极抗积碳和强度研究

国家自然科学基金

0+阅读 · 2013年12月31日

Haccpper环境中不锈钢表面活性与电化学噪声特征研究

国家自然科学基金

0+阅读 · 2012年12月31日

太阳大气中激波与冕环相互作用的研究

国家自然科学基金

0+阅读 · 2012年12月31日

呼伦贝尔羊草草甸草原CO2、CH4 和N2O 通量对放牧强度的响应机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

CX3CL1/CX3CR1相互作用调控低氧前列腺癌细胞转移的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员