以创性反向网络为基础的未受控制音频和电话序列的学习电话识别 (Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network) - 专知论文

会员服务 ·

0

Learning · Performer · HMM · 生成式对抗网络 · Better ·

2022 年 7 月 29 日

Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network

翻译：以创性反向网络为基础的未受控制音频和电话序列的学习电话识别

Da-rong Liu,Po-chun Hsu,Yi-chen Chen,Sung-feng Huang,Shun-po Chuang,Da-yi Wu,Hung-yi Lee

ASR has been shown to achieve great performance recently. However, most of them rely on massive paired data, which is not feasible for low-resource languages worldwide. This paper investigates how to learn directly from unpaired phone sequences and speech utterances. We design a two-stage iterative framework. GAN training is adopted in the first stage to find the mapping relationship between unpaired speech and phone sequence. In the second stage, another HMM model is introduced to train from the generator's output, which boosts the performance and provides a better segmentation for the next iteration. In the experiment, we first investigate different choices of model designs. Then we compare the framework to different types of baselines: (i) supervised methods (ii) acoustic unit discovery based methods (iii) methods learning from unpaired data. Our framework performs consistently better than all acoustic unit discovery methods and previous methods learning from unpaired data based on the TIMIT dataset.

翻译：显示ASR最近取得了巨大的成绩。但是, 他们大多依靠大量配对数据, 这对全世界低资源语言是行不通的。本文调查如何直接从未受重视的电话序列和语音表达中学习。我们设计了一个两阶段的迭接框架。在第一阶段, GAN 培训被采用, 以寻找未受重视的语音和电话序列之间的映射关系。在第二阶段, 引入另一个 HMM 模型来从发电机输出中培训, 这会提高性能, 并为下一次迭代提供更好的分隔。在实验中, 我们首先调查模型设计的不同选择。然后我们将框架与不同类型的基线进行比较:( 一) 监督方法 (二) 声音单位发现方法 (三) 从未受重视的数据中学习的方法。我们的框架比所有声学单元发现方法和以前根据TIMEX数据集从未受重视的数据学习的方法都好。

0

相关内容

Learning

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

专知会员服务

26+阅读 · 2020年7月24日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

多复变函数空间上(加权)复合算子的动力学性质

国家自然科学基金

1+阅读 · 2013年12月31日

Yb离子和Ce离子共掺以增强GaN:Er微纳米晶发光性能的研究

国家自然科学基金

0+阅读 · 2013年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

有机分子半导体的非局域电声子耦合：声子色散与二阶电声子相互作用的影响

国家自然科学基金

0+阅读 · 2012年12月31日

Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks

Arxiv

0+阅读 · 2022年9月25日

Clustering-Based Representation Learning through Output Translation and Its Application to Remote--Sensing Images

Arxiv

0+阅读 · 2022年9月25日

Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Arxiv

23+阅读 · 2020年3月7日

Deep Face Recognition: A Survey

Deep Face Recognition: A Survey

Arxiv

18+阅读 · 2019年2月12日

Event Extraction with Generative Adversarial Imitation Learning

Arxiv

13+阅读 · 2018年4月21日

VIP会员

文章信息

相关主题

生成式对抗网络

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

专知会员服务

26+阅读 · 2020年7月24日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

从代码基础模型到智能体与应用：代码智能的全面综述与实践指南

《北约认知战概念报告》

【MIT博士论文】高效的视觉合成生成模型

美海军放弃星座级转而采用国家安全巡逻舰设计

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

相关论文

Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks

Arxiv

0+阅读 · 2022年9月25日

Clustering-Based Representation Learning through Output Translation and Its Application to Remote--Sensing Images

Arxiv

0+阅读 · 2022年9月25日

Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Arxiv

23+阅读 · 2020年3月7日

Deep Face Recognition: A Survey

Deep Face Recognition: A Survey

Arxiv

18+阅读 · 2019年2月12日

Event Extraction with Generative Adversarial Imitation Learning

Arxiv

13+阅读 · 2018年4月21日

相关基金

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

多复变函数空间上(加权)复合算子的动力学性质

国家自然科学基金

1+阅读 · 2013年12月31日

Yb离子和Ce离子共掺以增强GaN:Er微纳米晶发光性能的研究

国家自然科学基金

0+阅读 · 2013年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

有机分子半导体的非局域电声子耦合：声子色散与二阶电声子相互作用的影响

国家自然科学基金

0+阅读 · 2012年12月31日

微信扫码咨询专知VIP会员