Federated Learning (FL) enables training state-of-the-art Automatic Speech Recognition (ASR) models on user devices (clients) in distributed systems, thereby avoiding the transmission of raw user data to a central server. A key challenge to the practical adoption of FL for ASR is obtaining ground-truth labels on the clients. Existing approaches rely on clients to manually transcribe their speech, which is impractical for obtaining large training corpora. A promising alternative is to use semi-/self-supervised learning approaches to leverage unlabelled user data. To this end, we propose FedNST, a novel federated ASR method for noisy student training of distributed ASR models with private unlabelled user data. We explore various facets of FedNST, such as training models with different proportions of unlabelled and labelled data, and evaluate the proposed approach on 1173 simulated clients. Evaluating FedNST on LibriSpeech, where 960 hours of speech data are split equally into server (labelled) and client (unlabelled) data, showed a 22.5% relative word error rate reduction (WERR) over a supervised baseline trained only on server data.
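To make the training setup concrete, below is a minimal sketch of one FedNST-style communication round, inferred from the abstract's description: a server-side teacher (trained on the labelled server split) pseudo-labels each client's private unlabelled data on-device, clients train a noise-augmented student locally, and the server aggregates the resulting models with data-size-weighted FedAvg. All function names (`pseudo_label`, `local_train`, `fednst_round`) and the toy linear model are illustrative assumptions, not the paper's actual implementation.

```python
# A hedged sketch of one FedNST-style round, assuming:
#   - a teacher model trained on labelled server data,
#   - on-device pseudo-labelling (raw data never leaves the client),
#   - noisy student training locally, aggregated via FedAvg.
# The linear model and squared loss are stand-ins for an ASR network.
import numpy as np

rng = np.random.default_rng(0)

def pseudo_label(teacher_w, client_features):
    """Teacher inference on the client's unlabelled data, run on-device."""
    return client_features @ teacher_w

def local_train(student_w, features, targets, lr=0.1, steps=20, noise=0.05):
    """Noisy student step: perturb inputs (a stand-in for SpecAugment-style
    noise) and fit the pseudo-labels by gradient descent on a squared loss."""
    w = student_w.copy()
    for _ in range(steps):
        noisy = features + noise * rng.standard_normal(features.shape)
        grad = 2 * noisy.T @ (noisy @ w - targets) / len(targets)
        w -= lr * grad
    return w

def fednst_round(teacher_w, student_w, clients):
    """One communication round: pseudo-label, train locally, FedAvg."""
    updates, sizes = [], []
    for feats in clients:
        targets = pseudo_label(teacher_w, feats)
        updates.append(local_train(student_w, feats, targets))
        sizes.append(len(feats))
    # Data-size-weighted average of client models (FedAvg).
    return np.average(updates, axis=0, weights=np.asarray(sizes, dtype=float))

# Toy demo: 5 clients holding private unlabelled features.
dim = 8
teacher = rng.standard_normal(dim)       # stands in for server-trained teacher
student = np.zeros(dim)
clients = [rng.standard_normal((30, dim)) for _ in range(5)]
student = fednst_round(teacher, student, clients)
print("student after one round:", np.round(student, 3))
```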