DiffSVC:歌声转换扩散概率模型 (DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion) - 专知论文

会员服务 ·

0

去噪 · Processing（编程语言） · MoDELS · INTERACT · INFORMS ·

2021 年 5 月 28 日

DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion

翻译：DiffSVC:歌声转换扩散概率模型

Songxiang Liu,Yuewen Cao,Dan Su,Helen Meng

from arxiv, Preprint. 8 pages, 2 figures and 1 table

Singing voice conversion (SVC) is one promising technique which can enrich the way of human-computer interaction by endowing a computer the ability to produce high-fidelity and expressive singing voice. In this paper, we propose DiffSVC, an SVC system based on denoising diffusion probabilistic model. DiffSVC uses phonetic posteriorgrams (PPGs) as content features. A denoising module is trained in DiffSVC, which takes destroyed mel spectrogram produced by the diffusion/forward process and its corresponding step information as input to predict the added Gaussian noise. We use PPGs, fundamental frequency features and loudness features as auxiliary input to assist the denoising process. Experiments show that DiffSVC can achieve superior conversion performance in terms of naturalness and voice similarity to current state-of-the-art SVC approaches.

翻译：唱声转换( SVC) 是一种很有希望的技术,它能够通过赋予计算机以产生高菲度和表达式歌声的能力来丰富人类-计算机互动的方式。在本文中,我们建议DiffSVC(基于分流扩散概率模型的SVC系统)作为基于分流扩散概率模型的SVC(SiffSVC)系统,DiffSVC(PPGs)使用语音后方位转换(PPPGs)作为内容特性。DiffSVC(DiffSVC)培训了一种分流模块,该模块将扩散/前方过程及其相应的步骤信息产生的被摧毁的光谱作为预测增加的高斯噪音的投入。我们使用PPGs(PPGs)、基本频率特征和声响度功能作为辅助投入,以协助分流过程。实验显示DiffSVC(PPC)在自然性和声音方面可以实现优异性转换性表现,并类似于当前最先进的SVC(SVC)方法。

0

相关内容

【CVPR2021】动态度量学习

【CVPR2021】动态度量学习

专知会员服务

41+阅读 · 2021年3月30日

【CVPR2021】自监督几何感知

【CVPR2021】自监督几何感知

专知会员服务

46+阅读 · 2021年3月6日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

【CVPR2020-清华大学】渐进对抗网络的细粒度域适应，Progressive Adversarial Networks

专知会员服务

27+阅读 · 2020年4月4日

【CVPR2020-牛津-谷歌】语音到动作:动作识别的跨模态监督，Cross-modal Supervision

【CVPR2020-牛津-谷歌】语音到动作:动作识别的跨模态监督，Cross-modal Supervision

专知会员服务

24+阅读 · 2020年3月31日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

【贝叶斯深度学习：一种基于模型的可解释方法】Bayesian deep learning: A model-based interpretable approach

【贝叶斯深度学习：一种基于模型的可解释方法】Bayesian deep learning: A model-based interpretable approach

专知会员服务

49+阅读 · 2020年1月1日

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

专知会员服务

197+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

已删除

将门创投

3+阅读 · 2019年1月15日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

专知

11+阅读 · 2018年6月4日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

【论文推荐】最新5篇行人再识别（ReID）相关论文—迁移学习、特征集成、重排序、多通道金字塔、深层生成模型

【论文推荐】最新5篇行人再识别（ReID）相关论文—迁移学习、特征集成、重排序、多通道金字塔、深层生成模型

专知

12+阅读 · 2018年3月24日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

On Prosody Modeling for ASR+TTS based Voice Conversion

On Prosody Modeling for ASR+TTS based Voice Conversion

Arxiv

0+阅读 · 2021年7月20日

Evaluating Probabilistic Inference in Deep Learning: Beyond Marginal Predictions

Arxiv

0+阅读 · 2021年7月20日

A Topological Perspective on Causal Inference

Arxiv

1+阅读 · 2021年7月18日

Probabilistic Process Algebra for True Concurrency

Arxiv

0+阅读 · 2021年7月18日

An Improved StarGAN for Emotional Voice Conversion: Enhancing Voice Quality and Data Augmentation

Arxiv

0+阅读 · 2021年7月18日

Beyond In-Place Corruption: Insertion and Deletion In Denoising Probabilistic Models

Arxiv

0+阅读 · 2021年7月16日

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

Arxiv

3+阅读 · 2019年9月12日

Neural source-filter-based waveform model for statistical parametric speech synthesis

Arxiv

4+阅读 · 2018年11月26日

Additive Margin Softmax for Face Verification

Arxiv

11+阅读 · 2018年1月18日

Beyond Part Models: Person Retrieval with Refined Part Pooling (and a Strong Convolutional Baseline)

Arxiv

7+阅读 · 2018年1月9日

VIP会员

文章信息

相关主题

Processing（编程语言）

相关VIP内容

【CVPR2021】动态度量学习

【CVPR2021】动态度量学习

专知会员服务

41+阅读 · 2021年3月30日

【CVPR2021】自监督几何感知

【CVPR2021】自监督几何感知

专知会员服务

46+阅读 · 2021年3月6日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

【CVPR2020-清华大学】渐进对抗网络的细粒度域适应，Progressive Adversarial Networks

专知会员服务

27+阅读 · 2020年4月4日

【CVPR2020-牛津-谷歌】语音到动作:动作识别的跨模态监督，Cross-modal Supervision

【CVPR2020-牛津-谷歌】语音到动作:动作识别的跨模态监督，Cross-modal Supervision

专知会员服务

24+阅读 · 2020年3月31日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

【贝叶斯深度学习：一种基于模型的可解释方法】Bayesian deep learning: A model-based interpretable approach

【贝叶斯深度学习：一种基于模型的可解释方法】Bayesian deep learning: A model-based interpretable approach

专知会员服务

49+阅读 · 2020年1月1日

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

专知会员服务

197+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】用于化学结构抽取的多模态文档理解

《人工智能在军事行动作战规划过程中的应用可能性》

【NeurIPS2025】熵正则化与分布强化学习的收敛定理

智能体安全综述：应用、威胁与防御

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

已删除

将门创投

3+阅读 · 2019年1月15日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

专知

11+阅读 · 2018年6月4日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

【论文推荐】最新5篇行人再识别（ReID）相关论文—迁移学习、特征集成、重排序、多通道金字塔、深层生成模型

【论文推荐】最新5篇行人再识别（ReID）相关论文—迁移学习、特征集成、重排序、多通道金字塔、深层生成模型

专知

12+阅读 · 2018年3月24日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

On Prosody Modeling for ASR+TTS based Voice Conversion

On Prosody Modeling for ASR+TTS based Voice Conversion

Arxiv

0+阅读 · 2021年7月20日

Evaluating Probabilistic Inference in Deep Learning: Beyond Marginal Predictions

Arxiv

0+阅读 · 2021年7月20日

A Topological Perspective on Causal Inference

Arxiv

1+阅读 · 2021年7月18日

Probabilistic Process Algebra for True Concurrency

Arxiv

0+阅读 · 2021年7月18日

An Improved StarGAN for Emotional Voice Conversion: Enhancing Voice Quality and Data Augmentation

Arxiv

0+阅读 · 2021年7月18日

Beyond In-Place Corruption: Insertion and Deletion In Denoising Probabilistic Models

Arxiv

0+阅读 · 2021年7月16日

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

Arxiv

3+阅读 · 2019年9月12日

Neural source-filter-based waveform model for statistical parametric speech synthesis

Arxiv

4+阅读 · 2018年11月26日

Additive Margin Softmax for Face Verification

Arxiv

11+阅读 · 2018年1月18日

Beyond Part Models: Person Retrieval with Refined Part Pooling (and a Strong Convolutional Baseline)

Arxiv

7+阅读 · 2018年1月9日

微信扫码咨询专知VIP会员