Generative Models for Improved Naturalness, Intelligibility, and Voicing of Whispered Speech

December 4, 2022

Dominik Wagner, Sebastian P. Bayerl, Hector A. Cordourier Maruri, Tobias Bocklet

From arXiv; accepted at SLT 2022

This work adapts two recent architectures of generative models and evaluates their effectiveness for the conversion of whispered speech to normal speech. We incorporate the normal target speech into the training criterion of vector-quantized variational autoencoders (VQ-VAEs) and MelGANs, thereby conditioning the systems to recover voiced speech from whispered inputs. Objective and subjective quality measures indicate that both VQ-VAEs and MelGANs can be modified to perform the conversion task. We find that the proposed approaches significantly improve the Mel cepstral distortion (MCD) metric by at least 25% relative to a DiscoGAN baseline. Subjective listening tests suggest that the MelGAN-based system significantly improves naturalness, intelligibility, and voicing compared to the whispered input speech. A novel evaluation measure based on differences between latent speech representations also indicates that our MelGAN-based approach yields improvements relative to the baseline.
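The abstract reports results as a relative improvement in Mel cepstral distortion (MCD). As a rough illustration of what that metric measures, the sketch below computes MCD with the commonly used textbook formula and then a relative improvement over a baseline. It is not taken from the paper: the function names, the choice to skip the zeroth (energy) coefficient, and the requirement that frames are already time-aligned (for example by dynamic time warping) are all assumptions made for this example.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames):
    """Mean MCD in dB over time-aligned frames of mel-cepstral coefficients.

    Assumption: both inputs are lists of equal-length coefficient vectors,
    already aligned frame by frame. Coefficient 0 (energy) is skipped,
    which is a common convention but not confirmed by the source.
    """
    if len(ref_frames) != len(conv_frames) or not ref_frames:
        raise ValueError("expected two non-empty, time-aligned sequences")
    # Constant factor of the standard formula: (10 / ln 10) * sqrt(2).
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        squared_error = sum((r - c) ** 2 for r, c in zip(ref[1:], conv[1:]))
        total += scale * math.sqrt(squared_error)
    return total / len(ref_frames)


def relative_improvement(baseline_mcd, system_mcd):
    """Fractional MCD reduction relative to a baseline (lower MCD is better)."""
    return (baseline_mcd - system_mcd) / baseline_mcd
```

With purely hypothetical values, a baseline MCD of 8.0 dB and a system MCD of 6.0 dB give `relative_improvement(8.0, 6.0) == 0.25`, which is the kind of "at least 25% relative" reduction the abstract describes.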
