An unsupervised text-to-speech synthesis (TTS) system learns to generate speech waveforms corresponding to any written sentence in a language by observing: 1) a collection of untranscribed speech waveforms in that language; 2) a collection of texts written in that language, without access to any transcribed speech. Developing such a system can significantly improve the availability of speech technology for languages without a large amount of parallel speech and text data. This paper proposes an unsupervised TTS system consisting of an alignment module that outputs pseudo-text and a synthesis module that uses pseudo-text for training and real text for inference. Our unsupervised system achieves performance comparable to the supervised system in seven languages with about 10-20 hours of speech each. A careful study of the effect of text units and vocoders has also been conducted to better understand which factors may affect unsupervised TTS performance. Samples generated by our models can be found at https://cactuswiththoughts.github.io/UnsupTTS-Demo, and our code can be found at https://github.com/lwang114/UnsupTTS.
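The two-stage pipeline described above can be sketched as follows. This is a minimal toy illustration, not the authors' actual implementation: every function here is a hypothetical placeholder standing in for the real alignment and synthesis models, with dummy outputs so the control flow is runnable.

```python
# Hypothetical sketch of the unsupervised TTS pipeline: an alignment module
# produces pseudo-text for untranscribed speech, a synthesis module trains on
# (pseudo-text, waveform) pairs, and real text is used only at inference time.
# All names and logic are illustrative placeholders, not the paper's code.

def align_speech_to_units(waveforms):
    """Alignment module: map untranscribed waveforms to pseudo-text.

    Stands in for an unsupervised speech-to-unit model; here it simply
    emits one dummy unit per four samples so the pipeline runs end to end.
    """
    return [[f"unit{i % 3}" for i in range(len(w) // 4)] for w in waveforms]

def train_synthesizer(pseudo_texts, waveforms):
    """Synthesis module: trained on (pseudo-text, waveform) pairs.

    A real system would fit a neural TTS model; this placeholder just
    records the pseudo-text vocabulary it was trained on.
    """
    return {"vocab": {u for seq in pseudo_texts for u in seq}}

def synthesize(model, text):
    """Inference: real text goes into the trained synthesizer."""
    return [0.0] * (len(text.split()) * 4)  # dummy waveform samples

# Toy corpora: untranscribed speech plus unpaired text, no parallel data.
speech = [[0.1] * 16, [0.2] * 8]
pseudo = align_speech_to_units(speech)   # pseudo-text labels for training
model = train_synthesizer(pseudo, speech)
wav = synthesize(model, "hello world")   # real text only at inference
```

The key property the sketch preserves is that the synthesizer never sees paired (real text, speech) data: its training labels come entirely from the alignment module.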