We investigate robustness properties of pre-trained neural models for automatic speech recognition. Real-world data in machine learning is rarely clean; it is typically corrupted by factors that vary with the domain, such as outliers, random noise, and adversarial noise. The models we develop should therefore be robust to such noisy data, a need that has driven the thriving field of robust machine learning. We consider this important issue in the setting of automatic speech recognition. With the increasing popularity of pre-trained models, analyzing and understanding their robustness to noise is an important question. In this work, we perform a robustness analysis of the pre-trained neural models wav2vec2, HuBERT, and DistilHuBERT on the LibriSpeech and TIMIT datasets. We apply several kinds of noising mechanisms and measure model performance in terms of inference time and the standard Word Error Rate (WER) metric. We also conduct an in-depth layer-wise analysis of the wav2vec2 model by injecting noise between layers, which enables us to characterize at a high level what each layer learns. Finally, for this model, we visualize how errors propagate across the layers and compare its behavior on clean versus noisy data. Our experiments confirm the predictions of Pasad et al. [2021] and raise interesting directions for future work.
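To make the evaluation setup concrete, below is a minimal sketch (not the paper's exact pipeline) of the kind of noise-robustness probe described above: transcribing a clean and a Gaussian-noised waveform with a pre-trained wav2vec2 checkpoint and comparing Word Error Rates. The checkpoint name, noise level, and the jiwer dependency are illustrative assumptions.

```python
# Sketch of a noise-robustness probe for a pre-trained ASR model.
# Assumptions: HuggingFace transformers, the facebook/wav2vec2-base-960h
# checkpoint, and jiwer for WER; none of these are confirmed by the paper.
import numpy as np
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC
from jiwer import wer

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

def transcribe(waveform: np.ndarray, sample_rate: int = 16000) -> str:
    """Greedy CTC decoding of a single 16 kHz waveform."""
    inputs = processor(waveform, sampling_rate=sample_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(ids)[0]

def add_gaussian_noise(waveform: np.ndarray, snr_db: float) -> np.ndarray:
    """Inject additive Gaussian noise at a target signal-to-noise ratio (dB)."""
    signal_power = np.mean(waveform ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.normal(0.0, np.sqrt(noise_power), size=waveform.shape)
    return waveform + noise

# Usage with a hypothetical 16 kHz utterance `audio` and its reference text:
# clean_wer = wer(reference, transcribe(audio))
# noisy_wer = wer(reference, transcribe(add_gaussian_noise(audio, snr_db=10)))
```

Comparing the clean and noisy WER at several SNR levels gives the kind of degradation curve the experiments quantify; the layer-wise analysis instead injects noise into intermediate transformer representations rather than the raw waveform.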