自我监督的演讲代表学习:审查 (Self-Supervised Speech Representation Learning: A Review) - 专知论文

会员服务 ·

0

Learning · 表示学习 · 可约的 · 表示 · Processing（编程语言） ·

2022 年 6 月 15 日

Self-Supervised Speech Representation Learning: A Review

翻译：自我监督的演讲代表学习:审查

Abdelrahman Mohamed,Hung-yi Lee,Lasse Borgholt,Jakob D. Havtorn,Joakim Edin,Christian Igel,Katrin Kirchhoff,Shang-Wen Li,Karen Livescu,Lars Maaløe,Tara N. Sainath,Shinji Watanabe

Although supervised deep learning has revolutionized speech and audio processing, it has necessitated the building of specialist models for individual tasks and application scenarios. It is likewise difficult to apply this to dialects and languages for which only limited labeled data is available. Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains. Such methods have shown success in natural language processing and computer vision domains, achieving new levels of performance while reducing the number of labels required for many downstream scenarios. Speech representation learning is experiencing similar progress in three main categories: generative, contrastive, and predictive methods. Other approaches rely on multi-modal data for pre-training, mixing text or visual data streams with speech. Although self-supervised speech representation is still a nascent research area, it is closely related to acoustic word embedding and learning with zero lexical resources, both of which have seen active research for many years. This review presents approaches for self-supervised speech representation learning and their connection to other research areas. Since many current methods focus solely on automatic speech recognition as a downstream task, we review recent efforts on benchmarking learned representations to extend the application beyond speech recognition.

翻译：虽然经过监督的深层学习使语音和音频处理革命了,但有必要为个别任务和应用设想方案建立专家模型,同样难以将这一模型应用于只有有限标签数据的方言和语言;自我监督的代表学习方法承诺采用单一的普遍模式,有利于各种各样的任务和领域;这些方法在自然语言处理和计算机视觉领域取得了成功,实现了新的业绩水平,同时减少了许多下游情景所需的标签数量;语音代表学习在三个主要类别(基因化、对比和预测方法)上也取得了类似进展;其他方法依赖多种模式数据,用于培训前、文本或视觉数据流与语音的混合;虽然自我监督的语音代表仍然是新生研究领域,但与用零词汇资源嵌入和学习的声音词密切相关,两者多年来都进行了积极研究;本审查提出了自我监督的语音代表学习方法及其与其他研究领域的联系。由于许多现行方法仅将自动语音识别作为下游任务,因此我们审查最近关于将学习的表述基准工作的努力,将应用扩大到语音识别之外。

0

相关内容

Learning

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

压电型半导体微纳结构界面光电转换及其调控机理的原位透射电镜研究

国家自然科学基金

0+阅读 · 2014年12月31日

高频外辐射源雷达基础理论与关键技术

国家自然科学基金

5+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

层状前驱体法制备多级结构高分散合金催化剂的构效关系研究

国家自然科学基金

0+阅读 · 2013年12月31日

拓扑绝缘体自旋输运性质的研究

国家自然科学基金

0+阅读 · 2012年12月31日

荷载-湿度-温度耦合效应下黄土崩塌灾害形成的机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

草鱼肠道抗菌肽A基因转录调控的分子机理

国家自然科学基金

0+阅读 · 2011年12月31日

航天器整体隔振系统多维隔振机构创新设计及半主动控制研究

国家自然科学基金

0+阅读 · 2011年12月31日

Ni3Al基合金单晶生长规律研究

国家自然科学基金

0+阅读 · 2009年12月31日

激光熔覆WC－FeNiCr无磁硬质合金磁性能影响因素及机理

国家自然科学基金

0+阅读 · 2009年12月31日

Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

Arxiv

28+阅读 · 2022年6月8日

Self-Supervised Learning for Recommender Systems: A Survey

Arxiv

12+阅读 · 2022年3月29日

Transformers in Medical Image Analysis: A Review

Transformers in Medical Image Analysis: A Review

Arxiv

40+阅读 · 2022年2月24日

A Survey on Green Deep Learning

Arxiv

10+阅读 · 2021年11月10日

Graph Self-Supervised Learning: A Survey

Arxiv

15+阅读 · 2021年8月5日

Graph Learning: A Survey

Arxiv

58+阅读 · 2021年5月3日

Self-Supervised Learning of Graph Neural Networks: A Unified Review

Arxiv

38+阅读 · 2021年2月23日

Self-supervised Learning: Generative or Contrastive

Arxiv

19+阅读 · 2020年7月21日

Multimodal Intelligence: Representation Learning, Information Fusion, and Applications

Arxiv

78+阅读 · 2019年11月10日

Deep learning for time series classification: a review

Arxiv

12+阅读 · 2019年3月14日

VIP会员

文章信息

相关主题

Processing（编程语言）

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

Arxiv

28+阅读 · 2022年6月8日

Self-Supervised Learning for Recommender Systems: A Survey

Arxiv

12+阅读 · 2022年3月29日

Transformers in Medical Image Analysis: A Review

Transformers in Medical Image Analysis: A Review

Arxiv

40+阅读 · 2022年2月24日

A Survey on Green Deep Learning

Arxiv

10+阅读 · 2021年11月10日

Graph Self-Supervised Learning: A Survey

Arxiv

15+阅读 · 2021年8月5日

Graph Learning: A Survey

Arxiv

58+阅读 · 2021年5月3日

Self-Supervised Learning of Graph Neural Networks: A Unified Review

Arxiv

38+阅读 · 2021年2月23日

Self-supervised Learning: Generative or Contrastive

Arxiv

19+阅读 · 2020年7月21日

Multimodal Intelligence: Representation Learning, Information Fusion, and Applications

Arxiv

78+阅读 · 2019年11月10日

Deep learning for time series classification: a review

Arxiv

12+阅读 · 2019年3月14日

相关基金

压电型半导体微纳结构界面光电转换及其调控机理的原位透射电镜研究

国家自然科学基金

0+阅读 · 2014年12月31日

高频外辐射源雷达基础理论与关键技术

国家自然科学基金

5+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

层状前驱体法制备多级结构高分散合金催化剂的构效关系研究

国家自然科学基金

0+阅读 · 2013年12月31日

拓扑绝缘体自旋输运性质的研究

国家自然科学基金

0+阅读 · 2012年12月31日

荷载-湿度-温度耦合效应下黄土崩塌灾害形成的机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

草鱼肠道抗菌肽A基因转录调控的分子机理

国家自然科学基金

0+阅读 · 2011年12月31日

航天器整体隔振系统多维隔振机构创新设计及半主动控制研究

国家自然科学基金

0+阅读 · 2011年12月31日

Ni3Al基合金单晶生长规律研究

国家自然科学基金

0+阅读 · 2009年12月31日

激光熔覆WC－FeNiCr无磁硬质合金磁性能影响因素及机理

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员