使用变化式自动编码器和反向培训的双级深层次代表学习式语音强化方法 (A Two-Stage Deep Representation Learning-Based Speech Enhancement Method Using Variational Autoencoder and Adversarial Training) - 专知论文

会员服务 ·

0

估计/估计量 · 语音增强 · 自编码器 · 变分自编码 · 表示 ·

2022 年 11 月 16 日

A Two-Stage Deep Representation Learning-Based Speech Enhancement Method Using Variational Autoencoder and Adversarial Training

翻译：使用变化式自动编码器和反向培训的双级深层次代表学习式语音强化方法

Yang Xiang,Jesper Lisby Højvang,Morten Højfeldt Rasmussen,Mads Græsbøll Christensen

from arxiv, Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing

This paper focuses on leveraging deep representation learning (DRL) for speech enhancement (SE). In general, the performance of the deep neural network (DNN) is heavily dependent on the learning of data representation. However, the DRL's importance is often ignored in many DNN-based SE algorithms. To obtain a higher quality enhanced speech, we propose a two-stage DRL-based SE method through adversarial training. In the first stage, we disentangle different latent variables because disentangled representations can help DNN generate a better enhanced speech. Specifically, we use the $\beta$-variational autoencoder (VAE) algorithm to obtain the speech and noise posterior estimations and related representations from the observed signal. However, since the posteriors and representations are intractable and we can only apply a conditional assumption to estimate them, it is difficult to ensure that these estimations are always pretty accurate, which may potentially degrade the final accuracy of the signal estimation. To further improve the quality of enhanced speech, in the second stage, we introduce adversarial training to reduce the effect of the inaccurate posterior towards signal reconstruction and improve the signal estimation accuracy, making our algorithm more robust for the potentially inaccurate posterior estimations. As a result, better SE performance can be achieved. The experimental results indicate that the proposed strategy can help similar DNN-based SE algorithms achieve higher short-time objective intelligibility (STOI), perceptual evaluation of speech quality (PESQ), and scale-invariant signal-to-distortion ratio (SI-SDR) scores. Moreover, the proposed algorithm can also outperform recent competitive SE algorithms.

翻译：本文侧重于利用深层代表学习(DRL)来增强语言能力(SE) 。一般来说,深神经网络(DNN)的性能在很大程度上取决于数据代表性的学习。但是,在许多基于 DNN 的 SE 算法中,DRL的重要性常常被忽视。为了获得质量更高的强化演讲,我们建议通过对抗性培训,采用基于DRL的双阶段SE方法。在第一阶段,我们分解不同的潜在变量,因为分解的表达方式可以帮助DNNN产生更好的发言质量。具体地说,我们使用美元-变异自动计算(VADE)算法(VADE)的算法来获取最新信号显示的言词和噪声上的估计及相关的表述。然而,由于SDRLS的演算法非常棘手,我们只能用一个有条件的假设来估计它们。在第一阶段,这可能会降低基于信号的估计的最后准确性。为了进一步提高发言质量,在第二阶段,我们引入了对抗性培训,以降低Asrior的不准确度(VE-LIS) 的准确度评估可以使S 实现更准确的S 。

0

相关内容

估计/估计量

估计/估计量

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

S3AGA样本（Spitzer-SDSS Spectral Atlas of Galaxies and AGNs)及其AGN研究

国家自然科学基金

0+阅读 · 2014年12月31日

IRF5诱导小胶质细胞极化在脑缺血后神经元损伤中的作用及机制

国家自然科学基金

0+阅读 · 2014年12月31日

钡锗基过渡族金属笼装物clathrate材料单晶生长、能带结构调控与非简谐振动机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

化痰通脉饮对PCOS的IRS-1-PI3K/AKT/NF-κB串流失控的调节效应研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kupffer细胞上GITRL在大鼠肝移植免疫耐受重建中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

TLR4 /ROS信号通路在AMD发病中的作用机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

压缩感知中采样与重建的理论及算法研究

国家自然科学基金

0+阅读 · 2011年12月31日

多发性硬化Th17和Treg细胞失衡的miRNA调控机制研究

国家自然科学基金

0+阅读 · 2010年12月31日

基于Sparse-Land模型的SAR图像噪声抑制与分割

国家自然科学基金

0+阅读 · 2009年12月31日

PD-1/PD-L1途径通过Kupffer细胞诱导大鼠肝移植免疫耐受的机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

Fast matrix-free methods for model-based personalized synthetic MR imaging

Arxiv

0+阅读 · 2023年1月17日

Neural Architecture Search without Training

Neural Architecture Search without Training

Arxiv

10+阅读 · 2021年6月11日

Adversarial and Contrastive Variational Autoencoder for Sequential Recommendation

Arxiv

17+阅读 · 2021年3月19日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

Learning Latent Representations to Influence Multi-Agent Interaction

Arxiv

11+阅读 · 2020年11月12日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Arxiv

23+阅读 · 2020年3月7日

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

Arxiv

15+阅读 · 2019年1月15日

Generative Adversarial Autoencoder Networks

Arxiv

11+阅读 · 2018年3月23日

VIP会员

文章信息

相关主题

估计/估计量

变分自编码

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

相关论文

Fast matrix-free methods for model-based personalized synthetic MR imaging

Arxiv

0+阅读 · 2023年1月17日

Neural Architecture Search without Training

Neural Architecture Search without Training

Arxiv

10+阅读 · 2021年6月11日

Adversarial and Contrastive Variational Autoencoder for Sequential Recommendation

Arxiv

17+阅读 · 2021年3月19日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

Learning Latent Representations to Influence Multi-Agent Interaction

Arxiv

11+阅读 · 2020年11月12日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Arxiv

23+阅读 · 2020年3月7日

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

Arxiv

15+阅读 · 2019年1月15日

Generative Adversarial Autoencoder Networks

Arxiv

11+阅读 · 2018年3月23日

相关基金

S3AGA样本（Spitzer-SDSS Spectral Atlas of Galaxies and AGNs)及其AGN研究

国家自然科学基金

0+阅读 · 2014年12月31日

IRF5诱导小胶质细胞极化在脑缺血后神经元损伤中的作用及机制

国家自然科学基金

0+阅读 · 2014年12月31日

钡锗基过渡族金属笼装物clathrate材料单晶生长、能带结构调控与非简谐振动机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

化痰通脉饮对PCOS的IRS-1-PI3K/AKT/NF-κB串流失控的调节效应研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kupffer细胞上GITRL在大鼠肝移植免疫耐受重建中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

TLR4 /ROS信号通路在AMD发病中的作用机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

压缩感知中采样与重建的理论及算法研究

国家自然科学基金

0+阅读 · 2011年12月31日

多发性硬化Th17和Treg细胞失衡的miRNA调控机制研究

国家自然科学基金

0+阅读 · 2010年12月31日

基于Sparse-Land模型的SAR图像噪声抑制与分割

国家自然科学基金

0+阅读 · 2009年12月31日

PD-1/PD-L1途径通过Kupffer细胞诱导大鼠肝移植免疫耐受的机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员