《议长尖声化回顾:深学习的最近进展》 (A Review of Speaker Diarization: Recent Advances with Deep Learning) - 专知论文

会员服务 ·

0

学成 · 可辨认的 · surge · 深度学习 · 语音识别 ·

2021 年 1 月 24 日

A Review of Speaker Diarization: Recent Advances with Deep Learning

翻译：《议长尖声化回顾:深学习的最近进展》

Tae Jin Park,Naoyuki Kanda,Dimitrios Dimitriadis,Kyu J. Han,Shinji Watanabe,Shrikanth Narayanan

Speaker diarization is a task to label audio or video recordings with classes corresponding to speaker identity, or in short, a task to identify "who spoke when". In the early years, speaker diarization algorithms were developed for speech recognition on multi-speaker audio recordings to enable speaker adaptive processing, but also gained its own value as a stand-alone application over time to provide speaker-specific meta information for downstream tasks such as audio retrieval. More recently, with the rise of deep learning technology that has been a driving force to revolutionary changes in research and practices across speech application domains in the past decade, more rapid advancements have been made for speaker diarization. In this paper, we review not only the historical development of speaker diarization technology but also the recent advancements in neural speaker diarization approaches. We also discuss how speaker diarization systems have been integrated with speech recognition applications and how the recent surge of deep learning is leading the way of jointly modeling these two components to be complementary to each other. By considering such exciting technical trends, we believe that it is a valuable contribution to the community to provide a survey work by consolidating the recent developments with neural methods and thus facilitating further progress towards a more efficient speaker diarization.

翻译：发言人的diarization 算法在最初几年里不仅为多声音录音的语音识别开发了语言diarization 算法,以使发言者能够适应性处理,而且还在一段时间里取得了其自身的单独应用价值,以便为音频检索等下游任务提供针对发言者的特定元信息。最近,随着深层次学习技术的兴起,成为过去十年来语音应用领域研究和做法革命性变化的驱动力,对发言者的diarization有了更迅速的进展。在本文件中,我们不仅审查了发言者的diarization技术的历史发展,而且审查了神经扩音器diarization方法的最新进展。我们还讨论了发言者的diarization系统如何与语音识别应用程序相结合,以及最近深层学习的激增如何导致联合模拟这两个组成部分,以相互补充。我们考虑到这种令人振奋人心的技术趋势,我们认为,通过以神经扩音器巩固最近的发展,从而促进进一步的进展,对社区提供了宝贵的贡献。

0

相关内容

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

专知会员服务

53+阅读 · 2020年4月7日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【北邮-腾讯AI】自监督学习音视觉说话人认证，Self-supervised learning for audio-visual speaker diarization

【北邮-腾讯AI】自监督学习音视觉说话人认证，Self-supervised learning for audio-visual speaker diarization

专知会员服务

26+阅读 · 2020年2月16日

【图深度学习GDL论文大全】A comprehensive collection of recent papers on graph deep learning

【图深度学习GDL论文大全】A comprehensive collection of recent papers on graph deep learning

专知会员服务

47+阅读 · 2019年12月1日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

计算机 | USENIX Security 2020等国际会议信息5条

计算机 | USENIX Security 2020等国际会议信息5条

Call4Papers

7+阅读 · 2019年4月25日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

Privacy Preservation in Federated Learning: An insightful survey from the GDPR Perspective

Arxiv

2+阅读 · 2021年3月18日

Secure Watermark for Deep Neural Networks with Multi-task Learning

Arxiv

0+阅读 · 2021年3月18日

A Systematic Literature Review and Taxonomy of Modern Code Review

Arxiv

0+阅读 · 2021年3月16日

Model Complexity of Deep Learning: A Survey

Arxiv

32+阅读 · 2021年3月8日

Recent advances in deep learning theory

Recent advances in deep learning theory

Arxiv

50+阅读 · 2020年12月20日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

26+阅读 · 2020年3月13日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Deep learning for time series classification: a review

Arxiv

12+阅读 · 2019年3月14日

Deep Reinforcement Learning: An Overview

Arxiv

15+阅读 · 2018年6月23日

Deep Learning based Recommender System: A Survey and New Perspectives

Arxiv

6+阅读 · 2017年8月3日

VIP会员

文章信息

相关主题

相关VIP内容

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

专知会员服务

53+阅读 · 2020年4月7日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【北邮-腾讯AI】自监督学习音视觉说话人认证，Self-supervised learning for audio-visual speaker diarization

【北邮-腾讯AI】自监督学习音视觉说话人认证，Self-supervised learning for audio-visual speaker diarization

专知会员服务

26+阅读 · 2020年2月16日

【图深度学习GDL论文大全】A comprehensive collection of recent papers on graph deep learning

【图深度学习GDL论文大全】A comprehensive collection of recent papers on graph deep learning

专知会员服务

47+阅读 · 2019年12月1日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

数据要素发展报告(2025年)：附下载

人工智能代理提升战时舰船战备水平

【NeurIPS2025教程】大语言模型规划

NeurIPS 2025 教程：深度学习训练不稳定性的理论洞见

相关资讯

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

计算机 | USENIX Security 2020等国际会议信息5条

计算机 | USENIX Security 2020等国际会议信息5条

Call4Papers

7+阅读 · 2019年4月25日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

相关论文

Privacy Preservation in Federated Learning: An insightful survey from the GDPR Perspective

Arxiv

2+阅读 · 2021年3月18日

Secure Watermark for Deep Neural Networks with Multi-task Learning

Arxiv

0+阅读 · 2021年3月18日

A Systematic Literature Review and Taxonomy of Modern Code Review

Arxiv

0+阅读 · 2021年3月16日

Model Complexity of Deep Learning: A Survey

Arxiv

32+阅读 · 2021年3月8日

Recent advances in deep learning theory

Recent advances in deep learning theory

Arxiv

50+阅读 · 2020年12月20日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

26+阅读 · 2020年3月13日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Deep learning for time series classification: a review

Arxiv

12+阅读 · 2019年3月14日

Deep Reinforcement Learning: An Overview

Arxiv

15+阅读 · 2018年6月23日

Deep Learning based Recommender System: A Survey and New Perspectives

Arxiv

6+阅读 · 2017年8月3日

微信扫码咨询专知VIP会员