增强语音和有噪音的语音识别网络 (Speech-enhanced and Noise-aware Networks for Robust Speech Recognition) - 专知论文

会员服务 ·

0

稳健性 · 语音识别 · Networking · 时延神经网络 · 优化器 ·

2022 年 7 月 14 日

Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

翻译：增强语音和有噪音的语音识别网络

Hung-Shin Lee,Pin-Yuan Chen,Yao-Fei Cheng,Yu Tsao,Hsin-Min Wang

from arxiv, submitted to ISCSLP 2022

Compensation for channel mismatch and noise interference is essential for robust automatic speech recognition. Enhanced speech has been introduced into the multi-condition training of acoustic models to improve their generalization ability. In this paper, a noise-aware training framework based on two cascaded neural structures is proposed to jointly optimize speech enhancement and speech recognition. The feature enhancement module is composed of a multi-task autoencoder, where noisy speech is decomposed into clean speech and noise. By concatenating its enhanced, noise-aware, and noisy features for each frame, the acoustic-modeling module maps each feature-augmented frame into a triphone state by optimizing the lattice-free maximum mutual information and cross entropy between the predicted and actual state sequences. On top of the factorized time delay neural network (TDNN-F) and its convolutional variant (CNN-TDNNF), both with SpecAug, the two proposed systems achieve word error rate (WER) of 3.90% and 3.55%, respectively, on the Aurora-4 task. Compared with the best existing systems that use bigram and trigram language models for decoding, the proposed CNN-TDNNF-based system achieves a relative WER reduction of 15.20% and 33.53%, respectively. In addition, the proposed CNN-TDNNF-based system also outperforms the baseline CNN-TDNNF system on the AMI task.

翻译：对频道错配和噪声干扰进行补偿是稳健自动语音识别的关键。强化的语音已经引入声学模型的多条件培训,以提高其普及能力。在本文中,基于两个级级神经结构的噪音感知培训框架建议共同优化语音增强和语音识别。功能增强模块由多塔斯克自动coder组成,其中噪音的语音分解成清洁的言词和噪音。通过对每个框架的强化、噪音觉悟和噪音特性进行整合,声学模型将每个地貌放大框架都映射成三声频状态,优化无挂式最大相互信息以及预测和实际状态序列之间的交叉英特质。在因数化时间延迟神经网络(TDNNN-F)及其演变变(CN-TDNNNF)中,与SpecauA-NF系统(WNF-NF)的拟议最大和三基语言模式相比,A-NF系统(CR-NF)将分别缩小15 %。

0

相关内容

稳健性

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Artificial Intelligence: Ready to Ride the Wave? BCG 28页PPT

Artificial Intelligence: Ready to Ride the Wave? BCG 28页PPT

专知会员服务

28+阅读 · 2022年2月20日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

专知会员服务

74+阅读 · 2020年7月6日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

介孔材料受限空间中的AGET ATRP和ARGET ATRP聚合反应

国家自然科学基金

0+阅读 · 2016年12月31日

利用全极化SAR数据反演地表土壤水分方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

S3AGA样本（Spitzer-SDSS Spectral Atlas of Galaxies and AGNs)及其AGN研究

国家自然科学基金

0+阅读 · 2014年12月31日

面向X-CT应用的(Ce, Lu)3(Cr, Al)5O12闪烁陶瓷中过渡金属离子的光谱展宽效应研究

国家自然科学基金

0+阅读 · 2014年12月31日

多层积雪相干散射模型与SAR参数反演

国家自然科学基金

0+阅读 · 2014年12月31日

基于双指标多等级的土壤重金属生态风险评价方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

薄势垒增强型AlGaN/GaN HEMT及可靠性研究

国家自然科学基金

0+阅读 · 2013年12月31日

微波加热开采低渗透煤层煤层气的逾渗机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

CaCCs/TMEM16A通道离子跨膜输运机制的研究

国家自然科学基金

0+阅读 · 2011年12月31日

长白山地森林植被季相变化及预报研究

国家自然科学基金

0+阅读 · 2011年12月31日

Streaming End-to-End Target Speaker ASR

Arxiv

0+阅读 · 2022年9月9日

HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement

HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement

Arxiv

0+阅读 · 2022年9月8日

Robust Self-Ensembling Network for Hyperspectral Image Classification

Arxiv

0+阅读 · 2022年9月7日

Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems

Arxiv

0+阅读 · 2022年9月7日

Model-Based Policy Search Using Monte Carlo Gradient Estimation with Real Systems Application

Arxiv

0+阅读 · 2022年9月6日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

Phase-aware Speech Enhancement with Deep Complex U-Net

Phase-aware Speech Enhancement with Deep Complex U-Net

Arxiv

15+阅读 · 2019年3月7日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

Graph Convolutional Networks for Named Entity Recognition

Arxiv

17+阅读 · 2018年2月14日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Arxiv

11+阅读 · 2018年2月10日

VIP会员

文章信息

相关主题

时延神经网络

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Artificial Intelligence: Ready to Ride the Wave? BCG 28页PPT

Artificial Intelligence: Ready to Ride the Wave? BCG 28页PPT

专知会员服务

28+阅读 · 2022年2月20日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

专知会员服务

74+阅读 · 2020年7月6日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《俄乌战争中的无人系统：新的战争方式与新兴趋势——来自前线的印象》报告

《海上自主水面船舶远程操作中心：安全可持续运行的多维度分析》

多模态大语言模型下游调优中“保持自我”的重要性

隐身自主无人水下航行器技术如何变革水下作战并重塑海军竞争

相关资讯

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

相关论文

Streaming End-to-End Target Speaker ASR

Arxiv

0+阅读 · 2022年9月9日

HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement

HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement

Arxiv

0+阅读 · 2022年9月8日

Robust Self-Ensembling Network for Hyperspectral Image Classification

Arxiv

0+阅读 · 2022年9月7日

Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems

Arxiv

0+阅读 · 2022年9月7日

Model-Based Policy Search Using Monte Carlo Gradient Estimation with Real Systems Application

Arxiv

0+阅读 · 2022年9月6日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

Phase-aware Speech Enhancement with Deep Complex U-Net

Phase-aware Speech Enhancement with Deep Complex U-Net

Arxiv

15+阅读 · 2019年3月7日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

Graph Convolutional Networks for Named Entity Recognition

Arxiv

17+阅读 · 2018年2月14日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Arxiv

11+阅读 · 2018年2月10日

相关基金

介孔材料受限空间中的AGET ATRP和ARGET ATRP聚合反应

国家自然科学基金

0+阅读 · 2016年12月31日

利用全极化SAR数据反演地表土壤水分方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

S3AGA样本（Spitzer-SDSS Spectral Atlas of Galaxies and AGNs)及其AGN研究

国家自然科学基金

0+阅读 · 2014年12月31日

面向X-CT应用的(Ce, Lu)3(Cr, Al)5O12闪烁陶瓷中过渡金属离子的光谱展宽效应研究

国家自然科学基金

0+阅读 · 2014年12月31日

多层积雪相干散射模型与SAR参数反演

国家自然科学基金

0+阅读 · 2014年12月31日

基于双指标多等级的土壤重金属生态风险评价方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

薄势垒增强型AlGaN/GaN HEMT及可靠性研究

国家自然科学基金

0+阅读 · 2013年12月31日

微波加热开采低渗透煤层煤层气的逾渗机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

CaCCs/TMEM16A通道离子跨膜输运机制的研究

国家自然科学基金

0+阅读 · 2011年12月31日

长白山地森林植被季相变化及预报研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员