We explore cross-lingual multi-speaker speech synthesis and cross-lingual voice conversion applied to data augmentation for automatic speech recognition (ASR) systems. Through extensive experiments, we show that our approach enables speech synthesis and voice conversion to improve ASR systems for a target language while using only one target-language speaker during model training. Relative to other works that rely on many speakers, we also narrow the gap between ASR models trained on synthesized versus human speech. Finally, we show that promising ASR training results can be obtained with our data augmentation method using only a single real speaker in the target language.
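As a rough illustration of the data-augmentation idea summarized above (not the authors' actual pipeline), the sketch below mixes a small single-speaker corpus with utterances produced by a synthesis front end before ASR training. The `synthesize` callback and all names are placeholders standing in for a cross-lingual TTS or voice-conversion model.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Utterance:
    audio_path: str   # path to a waveform file
    text: str         # transcript used as the ASR training target

def augment_with_synthetic(
    real: List[Utterance],
    extra_texts: List[str],
    synthesize: Callable[[str], str],   # placeholder TTS/VC front end: text -> audio path
    synth_ratio: float = 1.0,
    seed: int = 0,
) -> List[Utterance]:
    """Mix a small single-speaker corpus with synthesized utterances.

    `synth_ratio` controls how much synthetic speech is added relative
    to the amount of real data; the combined, shuffled list is then fed
    to whatever ASR training loop is in use.
    """
    rng = random.Random(seed)
    n_synth = int(len(real) * synth_ratio)
    picked = rng.sample(extra_texts, min(n_synth, len(extra_texts)))
    synthetic = [Utterance(audio_path=synthesize(t), text=t) for t in picked]
    mixed = real + synthetic
    rng.shuffle(mixed)
    return mixed
```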