Joint magnitude estimation and phase recovery using Cycle-in-Cycle GAN for non-parallel speech enhancement

Estimation/Estimator · Speech enhancement · CycleGAN · INFORMS · Pair

February 14, 2022


Guochen Yu, Andong Li, Yutian Wang, Yinuo Guo, Hui Wang, Chengshi Zheng

From arXiv; accepted by ICASSP 2022

Because adequate paired noisy-clean speech corpora are lacking in many real scenarios, non-parallel training is a promising direction for DNN-based speech enhancement methods. However, because of the severe mismatch between input and target speech, many previous studies estimate only the magnitude spectrum and leave the phase unaltered, which degrades speech quality under low signal-to-noise ratio conditions. To tackle this problem, we decouple the difficult optimization target, the original spectrum, into spectral magnitude and phase, and propose a novel Cycle-in-Cycle generative adversarial network (dubbed CinCGAN) that jointly estimates the spectral magnitude and phase information stage by stage from unpaired data. In the first stage, we pretrain a magnitude CycleGAN to coarsely estimate the spectral magnitude of clean speech. In the second stage, we combine the pretrained CycleGAN with a complex-valued CycleGAN in a cycle-in-cycle structure to simultaneously recover phase information and refine the overall spectrum. Experimental results demonstrate that the proposed approach significantly outperforms previous baselines under non-parallel training. An evaluation in which the models are trained on standard paired data also shows that CinCGAN achieves remarkable performance, especially in reducing background noise and speech distortion.
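The motivation for recovering phase as well as magnitude can be illustrated with a small numerical sketch. The snippet below is not the CinCGAN model and uses no neural network: it builds a hypothetical toy "spectrum" from random complex numbers, assumes an oracle (perfect) magnitude estimate, and compares reconstruction error when the noisy phase is kept versus when the clean phase is available. The helper names `decompose`, `recombine`, and `mse`, as well as the noise level and bin count, are illustrative assumptions.

```python
import cmath
import random


def decompose(spec):
    """Split a complex spectrum into per-bin magnitude and phase."""
    return [abs(z) for z in spec], [cmath.phase(z) for z in spec]


def recombine(mags, phases):
    """Rebuild a complex spectrum from magnitude and phase."""
    return [cmath.rect(m, p) for m, p in zip(mags, phases)]


def mse(a, b):
    """Mean squared error between two complex sequences."""
    return sum(abs(x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Hypothetical toy data: 64 complex bins standing in for a clean spectrum,
# plus additive complex Gaussian noise (assumed standard deviation 0.5).
clean = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(64)]
noisy = [z + complex(random.gauss(0, 0.5), random.gauss(0, 0.5)) for z in clean]

clean_mag, clean_phase = decompose(clean)
_, noisy_phase = decompose(noisy)

# Magnitude-only enhancement: oracle magnitude, but the noisy phase is reused.
mag_only = recombine(clean_mag, noisy_phase)
# Joint estimation: oracle magnitude and oracle phase.
joint = recombine(clean_mag, clean_phase)

err_mag_only = mse(mag_only, clean)
err_joint = mse(joint, clean)

print(f"error with noisy phase kept: {err_mag_only:.4f}")
print(f"error with phase recovered:  {err_joint:.2e}")
```

Even with a perfect magnitude estimate, reusing the noisy phase leaves a residual error, which is the gap that phase recovery is meant to close under these toy assumptions.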

