Representative Subset Selection for Efficient Fine-Tuning in Self-Supervised Speech Recognition - Zhuanzhi (专知) Papers

Speech recognition · Recognition · Fine-tuning · Supervision · Recognition models

April 11, 2023

Abdul Hameed Azeemi,Ihsan Ayyub Qazi,Agha Ali Raza

From arXiv, 16 pages, 8 figures

Self-supervised speech recognition models require considerable labeled training data for learning high-fidelity representations for Automatic Speech Recognition (ASR) which is computationally demanding and time-consuming. We consider the task of identifying an optimal subset of data for efficient fine-tuning in self-supervised speech models for ASR. We discover that the dataset pruning strategies used in vision tasks for sampling the most informative examples do not perform better than random subset selection on fine-tuning self-supervised ASR. We then present the COWERAGE algorithm for representative subset selection in self-supervised ASR. COWERAGE is based on our finding that ensuring the coverage of examples based on training Word Error Rate (WER) in the early training epochs leads to better generalization performance. Extensive experiments with the wav2vec 2.0 and HuBERT model on TIMIT, Librispeech, and LJSpeech datasets show the effectiveness of COWERAGE and its transferability across models, with up to 17% relative WER improvement over existing dataset pruning methods and random sampling. We also demonstrate that the coverage of training instances in terms of WER values ensures the inclusion of phonemically diverse examples, leading to better test accuracy in self-supervised speech recognition models.

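The coverage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' COWERAGE implementation: the abstract only states that the subset should cover the range of training WER values observed in early epochs, so the equal-size bucketing, the uniform sampling within each bucket, and the names `word_error_rate`, `select_by_wer_coverage`, `num_buckets`, and `fraction` are all assumptions made here for illustration.

```python
import random
from typing import List, Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def select_by_wer_coverage(wers: Sequence[float], fraction: float,
                           num_buckets: int = 10, seed: int = 0) -> List[int]:
    """Pick example indices so that every WER bucket is represented.

    Assumption (not from the paper): examples are sorted by training WER,
    split into equal-size contiguous buckets, and sampled uniformly
    within each bucket.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    order = sorted(range(len(wers)), key=lambda i: wers[i])
    n = len(order)
    budget = round(fraction * n)
    k = max(1, min(num_buckets, budget))  # never more buckets than picks
    selected: List[int] = []
    start = 0
    for b in range(k):
        size = n // k + (1 if b < n % k else 0)
        quota = budget // k + (1 if b < budget % k else 0)
        bucket = order[start:start + size]
        selected.extend(rng.sample(bucket, min(quota, len(bucket))))
        start += size
    return sorted(selected)


# Hypothetical per-example training WERs recorded after an early epoch.
early_wers = [i / 100 for i in range(100)]
subset = select_by_wer_coverage(early_wers, fraction=0.2, num_buckets=10)
```

With these toy values, each of the ten WER buckets contributes the same number of examples, so the subset contains easy, medium, and hard utterances rather than only the highest-scoring ones, which is the contrast with informativeness-based pruning that the abstract draws.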

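Related content

Speech recognition

Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methods and technologies enabling computers to recognize spoken language and convert it into text. It is also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text (STT). It draws on knowledge and research from computer science, linguistics, and computer engineering.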
