将语音承认、脱verferation、波形和自上至上学习代表制的最后到最后的整合 (End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation) - 专知论文

会员服务 ·

0

Integration · 语音识别 · 端到端 · Learning · Performer ·

2022 年 10 月 19 日

End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation

翻译：将语音承认、脱verferation、波形和自上至上学习代表制的最后到最后的整合

Yoshiki Masuyama,Xuankai Chang,Samuele Cornell,Shinji Watanabe,Nobutaka Ono

from arxiv, Accepted to IEEE SLT 2022

Self-supervised learning representation (SSLR) has demonstrated its significant effectiveness in automatic speech recognition (ASR), mainly with clean speech. Recent work pointed out the strength of integrating SSLR with single-channel speech enhancement for ASR in noisy environments. This paper further advances this integration by dealing with multi-channel input. We propose a novel end-to-end architecture by integrating dereverberation, beamforming, SSLR, and ASR within a single neural network. Our system achieves the best performance reported in the literature on the CHiME-4 6-channel track with a word error rate (WER) of 1.77%. While the WavLM-based strong SSLR demonstrates promising results by itself, the end-to-end integration with the weighted power minimization distortionless response beamformer, which simultaneously performs dereverberation and denoising, improves WER significantly. Its effectiveness is also validated on the REVERB dataset.

翻译：自监学习代表(SSLR)在自动语音识别(ASR)方面显示了其显著的实效,主要是干净的言语。最近的工作指出了将SSLR与ASR在吵闹环境中的单一频道语音增强相结合的力度。本文件通过处理多频道输入进一步推进了这种整合。我们提议在单一神经网络中整合一个新型端对端结构,将皮肤变异、波形变形、SSLR和ASR纳入一个神经网络。我们的系统实现了CHime-4 6频道轨道文献中报告的最佳性能,单词误差率为1.77%。WavLM的强大SSLR本身显示了有希望的结果,但端对端的整合与加权电源最小化的无扭曲性反应是前导的,同时进行脱色变和脱色,显著改善WER。它的效力也在RWEWER数据集上得到验证。

0

相关内容

Integration

Integration：Integration, the VLSI Journal。 Explanation：集成，VLSI杂志。 Publisher：Elsevier。 SIT：http://dblp.uni-trier.de/db/journals/integration/

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

两类带导数的非线性Schrodinger方程拟周期解的存在性

国家自然科学基金

0+阅读 · 2015年12月31日

高效、柔性钙钛矿型太阳能电池制备及性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

Serglycin调控TGF-β信号通路诱导EMT促进膀胱癌转移机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

机体锌稳衡在锌抗肉鸡沙门氏菌感染的作用及机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

BiFeO3/TiO2纳米管异质结光降解典型PFCs分子的特性与机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

螺旋二芴的化学掺杂及高效柔性固态太阳能电池的制备研究

国家自然科学基金

0+阅读 · 2012年12月31日

表面等离激元增强宽光谱InGaN太阳能电池研究

国家自然科学基金

0+阅读 · 2012年12月31日

染料敏化太阳电池中准固态陷光电解质的研究

国家自然科学基金

0+阅读 · 2012年12月31日

新型高电位锂离子电池正极材料的界面电化学分析

国家自然科学基金

0+阅读 · 2011年12月31日

高有序有机本体异质结太阳能电池的制备与研究

国家自然科学基金

0+阅读 · 2009年12月31日

CONDA: Continual Unsupervised Domain Adaptation Learning in Visual Perception for Self-Driving Cars

Arxiv

0+阅读 · 2022年12月1日

Dial-a-ride problem with modular platooning and en-route transfers

Arxiv

0+阅读 · 2022年12月1日

Self-Supervised Continual Graph Learning in Adaptive Riemannian Spaces

Arxiv

7+阅读 · 2022年11月30日

From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks

Arxiv

0+阅读 · 2022年11月30日

Learning with Partial Labels from Semi-supervised Perspective

Arxiv

0+阅读 · 2022年11月30日

Automatic Discovery of Multi-perspective Process Model using Reinforcement Learning

Arxiv

0+阅读 · 2022年11月30日

Self-Supervised Multi-Channel Hypergraph Convolutional Network for Social Recommendation

Arxiv

15+阅读 · 2021年1月21日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Bridging the Gap Between Spectral and Spatial Domains in Graph Neural Networks

Bridging the Gap Between Spectral and Spatial Domains in Graph Neural Networks

Arxiv

15+阅读 · 2020年3月26日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

VIP会员

文章信息

相关主题

相关VIP内容

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

赋能真实世界：基于大语言模型的产业智能体技术、实践与评测综述

军事行动中人工智能系统目标交战的附带损伤评估模型 | 最新文献

【普林斯顿博士论文】面向人本机器人学的安全与学习博弈论融合

美陆军协会（AUSA）2025 年会公布的美国十大武器与防务产品创新

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

CONDA: Continual Unsupervised Domain Adaptation Learning in Visual Perception for Self-Driving Cars

Arxiv

0+阅读 · 2022年12月1日

Dial-a-ride problem with modular platooning and en-route transfers

Arxiv

0+阅读 · 2022年12月1日

Self-Supervised Continual Graph Learning in Adaptive Riemannian Spaces

Arxiv

7+阅读 · 2022年11月30日

From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks

Arxiv

0+阅读 · 2022年11月30日

Learning with Partial Labels from Semi-supervised Perspective

Arxiv

0+阅读 · 2022年11月30日

Automatic Discovery of Multi-perspective Process Model using Reinforcement Learning

Arxiv

0+阅读 · 2022年11月30日

Self-Supervised Multi-Channel Hypergraph Convolutional Network for Social Recommendation

Arxiv

15+阅读 · 2021年1月21日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Bridging the Gap Between Spectral and Spatial Domains in Graph Neural Networks

Bridging the Gap Between Spectral and Spatial Domains in Graph Neural Networks

Arxiv

15+阅读 · 2020年3月26日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

相关基金

两类带导数的非线性Schrodinger方程拟周期解的存在性

国家自然科学基金

0+阅读 · 2015年12月31日

高效、柔性钙钛矿型太阳能电池制备及性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

Serglycin调控TGF-β信号通路诱导EMT促进膀胱癌转移机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

机体锌稳衡在锌抗肉鸡沙门氏菌感染的作用及机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

BiFeO3/TiO2纳米管异质结光降解典型PFCs分子的特性与机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

螺旋二芴的化学掺杂及高效柔性固态太阳能电池的制备研究

国家自然科学基金

0+阅读 · 2012年12月31日

表面等离激元增强宽光谱InGaN太阳能电池研究

国家自然科学基金

0+阅读 · 2012年12月31日

染料敏化太阳电池中准固态陷光电解质的研究

国家自然科学基金

0+阅读 · 2012年12月31日

新型高电位锂离子电池正极材料的界面电化学分析

国家自然科学基金

0+阅读 · 2011年12月31日

高有序有机本体异质结太阳能电池的制备与研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员