Federated Learning is a distributed machine learning paradigm dealing with decentralized and personal datasets. Since data reside on devices such as smartphones and virtual assistants, labeling is entrusted to the clients, or labels are extracted in an automated way. In the case of audio data specifically, acquiring semantic annotations can be prohibitively expensive and time-consuming. As a result, an abundance of audio data remains unlabeled and unexploited on users' devices. Most existing federated learning approaches focus on supervised learning without harnessing this unlabeled data. In this work, we study the problem of semi-supervised learning of audio models via self-training in conjunction with federated learning. We propose FedSTAR, which exploits large-scale on-device unlabeled data to improve the generalization of audio recognition models. We further demonstrate that self-supervised pre-trained models can accelerate the training of on-device models, enabling convergence within significantly fewer training rounds. We conduct experiments on diverse public audio classification datasets and investigate the performance of our models under varying percentages of labeled and unlabeled data. Notably, we show that with as little as 3% of the data labeled, FedSTAR improves the recognition rate by an average of 13.28% over the fully supervised federated model.
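Since the abstract describes self-training combined with federated learning only at a high level, the following is a minimal runnable sketch of the general recipe: each client pseudo-labels the unlabeled examples on which the current global model is confident, trains locally on its labeled plus pseudo-labeled data, and the server averages the resulting weights (plain FedAvg). The toy two-layer network, the synthetic features, the 0.9 confidence threshold, and the equal-weight averaging are all illustrative assumptions, not FedSTAR's actual implementation.

```python
# Illustrative sketch of federated self-training with pseudo-labeling.
# All hyperparameters, the model, and the data here are assumptions.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

THRESHOLD = 0.9                    # assumed confidence cutoff for pseudo-labels
ROUNDS, LOCAL_EPOCHS, NUM_CLIENTS = 5, 2, 4

def make_model():
    # Stand-in for an audio classifier operating on precomputed features
    # (e.g., spectrogram embeddings); 10 output classes assumed.
    return nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))

def local_update(global_model, x_lab, y_lab, x_unlab):
    """One client's round: self-train on labeled + confidently pseudo-labeled data."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    with torch.no_grad():                       # pseudo-label the unlabeled pool
        probs = F.softmax(model(x_unlab), dim=1)
        conf, pseudo = probs.max(dim=1)
        keep = conf > THRESHOLD                 # keep only confident predictions
    x = torch.cat([x_lab, x_unlab[keep]])
    y = torch.cat([y_lab, pseudo[keep]])
    for _ in range(LOCAL_EPOCHS):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return model.state_dict()

def fed_avg(states):
    """Server step: average client weights (equal client weighting assumed)."""
    avg = copy.deepcopy(states[0])
    for k in avg:
        avg[k] = torch.stack([s[k].float() for s in states]).mean(dim=0)
    return avg

global_model = make_model()
# Each client holds a small labeled set and a larger unlabeled pool (synthetic).
clients = [(torch.randn(8, 64), torch.randint(0, 10, (8,)), torch.randn(40, 64))
           for _ in range(NUM_CLIENTS)]
for r in range(ROUNDS):
    states = [local_update(global_model, *c) for c in clients]
    global_model.load_state_dict(fed_avg(states))
    print(f"round {r + 1} complete")
```

In this sketch the confidence threshold controls the trade-off the abstract alludes to: a higher threshold admits fewer but cleaner pseudo-labels, while a lower one uses more of the unlabeled pool at the risk of reinforcing the model's own mistakes.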