噪音目标发言者高质量语音转换学习独立语音代表 (Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers) - 专知论文

会员服务 ·

0

Learning · MoDELS · 潜在 · 表示 · 音素 ·

2022 年 7 月 2 日

Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers

翻译：噪音目标发言者高质量语音转换学习独立语音代表

Liumeng Xue,Shan Yang,Na Hu,Dan Su,Lei Xie

from arxiv, Accepted by INTERSPEECH 2022

Building a voice conversion system for noisy target speakers, such as users providing noisy samples or Internet found data, is a challenging task since the use of contaminated speech in model training will apparently degrade the conversion performance. In this paper, we leverage the advances of our recently proposed Glow-WaveGAN and propose a noise-independent speech representation learning approach for high-quality voice conversion for noisy target speakers. Specifically, we learn a latent feature space where we ensure that the target distribution modeled by the conversion model is exactly from the modeled distribution of the waveform generator. With this premise, we further manage to make the latent feature to be noise-invariant. Specifically, we introduce a noise-controllable WaveGAN, which directly learns the noise-independent acoustic representation from waveform by the encoder and conducts noise control in the hidden space through a FiLM module in the decoder. As for the conversion model, importantly, we use a flow-based model to learn the distribution of noise-independent but speaker-related latent features from phoneme posteriorgrams. Experimental results demonstrate that the proposed model achieves high speech quality and speaker similarity in the voice conversion for noisy target speakers.

翻译：在本文中,我们利用我们最近提议的Glow-WaveGAN的进度,为噪音目标发言者的高质量语音转换提出一个无噪音的语音代表学习方法。具体地说,我们学习了一个潜在特征空间,确保转换模式所模拟的目标分布完全来自波形发电机的模型分布。在此前提下,我们进一步设法使潜在特征成为噪音变异。具体地说,我们引入了一种噪音控制波式WaveGAN,直接从编码器的波形中学习噪音依赖声的声学代表,并通过解码器的FILM模块在隐蔽空间进行噪音控制。关于转换模式,我们使用流基模型来学习语音不依赖声音但与声音相关的潜伏特征的分布。实验结果显示,拟议模型在语音变换中实现了高语音质量和声音变压目标相似性。

0

相关内容

Learning

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

Heisenberg 群上的 k-平面变换

国家自然科学基金

0+阅读 · 2015年12月31日

微通道内非牛顿流体-弹性颗粒液固两相流动特性研究

国家自然科学基金

0+阅读 · 2014年12月31日

考虑观测值时空相关性的InSAR三维形变估计方法

国家自然科学基金

0+阅读 · 2013年12月31日

控释VEGF/NT-3脊髓脱细胞支架在SCI模型中的血管化及神经再生研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于Metasurface的THz慢波器件研究

国家自然科学基金

0+阅读 · 2013年12月31日

液态Ti3Al基合金的微观结构研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于压缩网络编码的无线传感网隐私与安全机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

阵列天线3D-SAR的DEM生成技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

非牛顿流磁流体动力学方程的数值方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

图和复杂网络的谱分析

国家自然科学基金

1+阅读 · 2009年12月31日

Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification

Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification

Arxiv

0+阅读 · 2022年8月23日

WeSinger 2: Fully Parallel Singing Voice Synthesis via Multi-Singer Conditional Adversarial Training

Arxiv

0+阅读 · 2022年8月22日

NOSMOG: Learning Noise-robust and Structure-aware MLPs on Graphs

Arxiv

0+阅读 · 2022年8月22日

Learning Speaker-specific Lip-to-Speech Generation

Arxiv

0+阅读 · 2022年8月20日

General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation

Arxiv

0+阅读 · 2022年8月20日

Federated Learning with Noisy Labels

Federated Learning with Noisy Labels

Arxiv

0+阅读 · 2022年8月19日

DAFT: Distilling Adversarially Fine-tuned Models for Better OOD Generalization

Arxiv

0+阅读 · 2022年8月19日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Generating Diverse and Accurate Visual Captions by Comparative Adversarial Learning

Arxiv

10+阅读 · 2018年4月11日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

VIP会员

文章信息

相关主题

相关VIP内容

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《太空边缘（临近空间）的武器化？军事高空平台的进展与前景》

《利用星基增强系统（SBAS）信号进行射频干扰（RFI）检测与特征分析》

美陆军在“艾布拉姆斯”坦克与“布拉德利”步战车上测试“牛蛙”反无人机炮塔

《军事领域特性及其对军事人工智能应用的影响》

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

相关论文

Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification

Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification

Arxiv

0+阅读 · 2022年8月23日

WeSinger 2: Fully Parallel Singing Voice Synthesis via Multi-Singer Conditional Adversarial Training

Arxiv

0+阅读 · 2022年8月22日

NOSMOG: Learning Noise-robust and Structure-aware MLPs on Graphs

Arxiv

0+阅读 · 2022年8月22日

Learning Speaker-specific Lip-to-Speech Generation

Arxiv

0+阅读 · 2022年8月20日

General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation

Arxiv

0+阅读 · 2022年8月20日

Federated Learning with Noisy Labels

Federated Learning with Noisy Labels

Arxiv

0+阅读 · 2022年8月19日

DAFT: Distilling Adversarially Fine-tuned Models for Better OOD Generalization

Arxiv

0+阅读 · 2022年8月19日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Generating Diverse and Accurate Visual Captions by Comparative Adversarial Learning

Arxiv

10+阅读 · 2018年4月11日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

相关基金

Heisenberg 群上的 k-平面变换

国家自然科学基金

0+阅读 · 2015年12月31日

微通道内非牛顿流体-弹性颗粒液固两相流动特性研究

国家自然科学基金

0+阅读 · 2014年12月31日

考虑观测值时空相关性的InSAR三维形变估计方法

国家自然科学基金

0+阅读 · 2013年12月31日

控释VEGF/NT-3脊髓脱细胞支架在SCI模型中的血管化及神经再生研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于Metasurface的THz慢波器件研究

国家自然科学基金

0+阅读 · 2013年12月31日

液态Ti3Al基合金的微观结构研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于压缩网络编码的无线传感网隐私与安全机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

阵列天线3D-SAR的DEM生成技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

非牛顿流磁流体动力学方程的数值方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

图和复杂网络的谱分析

国家自然科学基金

1+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员