StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition - Zhuanzhi (专知) Papers


Automatic speech recognition · Regularization term · Speech recognition · INFORMS · Training samples

August 10, 2021

StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition


Shoki Sakamoto, Akira Taniguchi, Tadahiro Taniguchi, Hirokazu Kameoka

From arXiv; 5 pages, 6 figures; accepted to INTERSPEECH 2021

Preserving the linguistic content of input speech is essential during voice conversion (VC). The star generative adversarial network-based VC method (StarGAN-VC) is a recently developed method that allows non-parallel many-to-many VC. Although this method is powerful, it can fail to preserve the linguistic content of input speech when the number of available training samples is extremely small. To overcome this problem, we propose using automatic speech recognition to assist model training and thereby improve StarGAN-VC, especially in low-resource scenarios. Experimental results show that, with our proposed method, StarGAN-VC retains more linguistic information than vanilla StarGAN-VC.

