In this paper, we propose to unify the two components of voice synthesis, namely text-to-speech (TTS) and the vocoder, into one framework based on a pair of forward and reverse-time linear stochastic differential equations (SDEs). The solutions of this SDE pair are two stochastic processes: one transforms the distribution of the mel spectrogram (or waveform) that we want to generate into a simple and tractable distribution; the other is the generation procedure that turns this tractable, simple signal back into the target mel spectrogram (or waveform). The model that generates mel spectrograms is called It\^oTTS, and the model that generates waveforms is called It\^oWave. It\^oTTS and It\^oWave use the Wiener process as a driver to gradually remove the excess signal from a noise signal, producing realistic and meaningful mel spectrograms and audio respectively, conditioned on the original text or mel spectrogram. Experimental results show that the mean opinion scores (MOS) of It\^oTTS and It\^oWave exceed those of current state-of-the-art methods, reaching 3.925$\pm$0.160 and 4.35$\pm$0.115 respectively. Generated audio samples are available at https://wushoule.github.io/ItoAudio/. All authors contributed equally to this work.
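For reference, the following is a minimal sketch of a generic forward/reverse-time SDE pair of the kind described above, written in the standard score-based form; the specific drift $f$, diffusion $g$, and score network used by It\^oTTS and It\^oWave are not given in this abstract, so these symbols are assumptions.
\begin{align}
\mathrm{d}\mathbf{x} &= f(\mathbf{x}, t)\,\mathrm{d}t + g(t)\,\mathrm{d}\mathbf{w}, \\
\mathrm{d}\mathbf{x} &= \big[f(\mathbf{x}, t) - g(t)^{2}\,\nabla_{\mathbf{x}} \log p_t(\mathbf{x} \mid c)\big]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar{\mathbf{w}},
\end{align}
where the first equation is the forward process that carries the data (mel spectrogram or waveform) to a tractable distribution, the second is the reverse-time generation process, $\mathbf{w}$ and $\bar{\mathbf{w}}$ are forward and reverse-time Wiener processes, $c$ denotes the conditioning input (text or mel spectrogram), and $\nabla_{\mathbf{x}} \log p_t(\mathbf{x} \mid c)$ is the conditional score that a neural network is trained to estimate.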