The Variational Autoencoder (VAE) is a powerful deep generative model that is now extensively used to represent high-dimensional complex data via a low-dimensional latent space learned in an unsupervised manner. In the original VAE model, input data vectors are processed independently. In recent years, a series of papers have presented different extensions of the VAE for processing sequential data; these models capture not only the latent space but also the temporal dependencies within a sequence of data vectors and the corresponding latent vectors, typically relying on recurrent neural networks. We recently performed a comprehensive review of those models and unified them into a general class called Dynamical Variational Autoencoders (DVAEs). In the present paper, we report the results of an experimental benchmark comparing six DVAE models on the speech analysis-resynthesis task, as an illustration of the high potential of DVAEs for speech modeling.
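As an illustration of the sequential latent-variable setup described above, the following is a minimal sketch in PyTorch of a VAE with recurrent temporal modeling. The class name DVAESketch, the layer dimensions, and the simple standard-Gaussian prior over each latent vector are illustrative assumptions; this is not one of the specific DVAE variants benchmarked in the paper.

```python
# Minimal DVAE-style sketch (illustrative, not a model from the benchmark):
# an RNN encoder infers a latent vector z_t per frame x_t, and an RNN decoder
# reconstructs x_t from the latent sequence.
import torch
import torch.nn as nn

class DVAESketch(nn.Module):
    def __init__(self, x_dim=64, z_dim=16, h_dim=128):
        super().__init__()
        self.enc_rnn = nn.GRU(x_dim, h_dim, batch_first=True)
        self.enc_mu = nn.Linear(h_dim, z_dim)
        self.enc_logvar = nn.Linear(h_dim, z_dim)
        self.dec_rnn = nn.GRU(z_dim, h_dim, batch_first=True)
        self.dec_out = nn.Linear(h_dim, x_dim)

    def forward(self, x):
        # Encoder: h_t summarizes x_{1:t}; q(z_t | x_{1:t}) = N(mu_t, diag(var_t))
        h, _ = self.enc_rnn(x)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        # Reparameterization trick
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # Decoder: reconstruct x_t from z_{1:t} through a second RNN
        g, _ = self.dec_rnn(z)
        x_hat = self.dec_out(g)
        # Negative ELBO with a standard Gaussian prior on each z_t (assumption)
        recon = ((x - x_hat) ** 2).sum(dim=(1, 2)).mean()
        kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(dim=(1, 2)).mean()
        return x_hat, recon + kl

model = DVAESketch()
x = torch.randn(8, 50, 64)   # batch of 8 sequences, 50 frames, 64 features each
x_hat, loss = model(x)       # analysis-resynthesis: encode, then reconstruct
loss.backward()
```

In an analysis-resynthesis setting, the encoder plays the role of the analysis stage (extracting a latent sequence from the speech features) and the decoder the resynthesis stage (reconstructing the features from the latents); the DVAE variants reviewed in the paper differ mainly in how the temporal dependencies between data and latent vectors are factorized and parameterized.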