低语言在线发言人与基于图表的标签代的 Diarization (Low-Latency Online Speaker Diarization with Graph-Based Label Generation) - 专知论文

会员服务 ·

0

标注 · 在线 · Performer · 簇 · Performance ·

2022 年 6 月 24 日

Low-Latency Online Speaker Diarization with Graph-Based Label Generation

翻译：低语言在线发言人与基于图表的标签代的 Diarization

Yucong Zhang,Qinjian Lin,Weiqing Wang,Lin Yang,Xuyang Wang,Junjie Wang,Ming Li

from arxiv, accepted by Odyssey 2022

This paper introduces an online speaker diarization system that can handle long-time audio with low latency. We enable Agglomerative Hierarchy Clustering (AHC) to work in an online fashion by introducing a label matching algorithm. This algorithm solves the inconsistency between output labels and hidden labels that are generated each turn. To ensure the low latency in the online setting, we introduce a variant of AHC, namely chkpt-AHC, to cluster the speakers. In addition, we propose a speaker embedding graph to exploit a graph-based re-clustering method, further improving the performance. In the experiment, we evaluate our systems on both DIHARD3 and VoxConverse datasets. The experimental results show that our proposed online systems have better performance than our baseline online system and have comparable performance to our offline systems. We find out that the framework combining the chkpt-AHC method and the label matching algorithm works well in the online setting. Moreover, the chkpt-AHC method greatly reduces the time cost, while the graph-based re-clustering method helps improve the performance.

翻译：本文介绍一个可以处理长期低延迟音频的在线语音分解系统。我们通过引入标签匹配算法, 使集压式等级分组(AHC)能够以在线方式工作。这个算法可以解决输出标签和每转产生的隐藏标签之间的不一致性。为了确保在线环境的低延迟性, 我们引入了一个AHC的变种, 即chkpt- AHC, 来分组演讲者。此外, 我们提议用一个以图表为基础的重新分组方法嵌入一个发言者图, 以进一步改进性能。在实验中, 我们评估了我们的 DISARD3 和 VoxConversion 数据集系统。实验结果显示, 我们提议的在线系统比基线在线系统有更好的性能, 并且与我们的离线系统有类似的性能。我们发现, 将 chkpt- AHC 方法和标签匹配算法结合的框架在在线环境中效果很好。此外, chkpt- AHC 方法极大地降低了时间成本, 而基于图形的重新分组方法有助于改善业绩。

0

相关内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【Google可解释人工智能白皮书】27页pdf，AI Explainability Whitepaper ，Introduction to AI Explanations for AI Platform

【Google可解释人工智能白皮书】27页pdf，AI Explainability Whitepaper ，Introduction to AI Explanations for AI Platform

专知会员服务

127+阅读 · 2019年12月13日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

亲水性氨基酸离子液体吸收CO2的传质-反应机理

国家自然科学基金

0+阅读 · 2014年12月31日

面向X-CT应用的(Ce, Lu)3(Cr, Al)5O12闪烁陶瓷中过渡金属离子的光谱展宽效应研究

国家自然科学基金

0+阅读 · 2014年12月31日

固定化离子液体调控酶催化甘油解反应产物组成的机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

非线性离散可积方程与离散Painlevé方程族的连续极限理论

国家自然科学基金

0+阅读 · 2013年12月31日

高稳定性银基复合光催化剂的制备及可见光催化降解有机污染物的研究

国家自然科学基金

0+阅读 · 2012年12月31日

功能化离子液体催化转换CO2合成含氮杂环化合物

国家自然科学基金

0+阅读 · 2012年12月31日

激光熔覆含硼BCC结构高熵合金涂层强韧化机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

高效中红外激光晶体Cr,Er,Re:YSGG（Re＝Eu3+, Tb3+）的生长及性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

CeO2催化CO2和甲醇合成碳酸二甲酯构效关系和反应机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

聚合物/富勒烯太阳能电池光活性层形貌的超分子调控及稳定化

国家自然科学基金

0+阅读 · 2011年12月31日

Semi-supervised Human Pose Estimation in Art-historical Images

Arxiv

0+阅读 · 2022年8月15日

Doubly Robust Estimation under Covariate-induced Dependent Left Truncation

Arxiv

0+阅读 · 2022年8月14日

Heterogeneous Line Graph Transformer for Math Word Problems

Arxiv

1+阅读 · 2022年8月12日

Domain-Specific Text Generation for Machine Translation

Domain-Specific Text Generation for Machine Translation

Arxiv

0+阅读 · 2022年8月11日

Speech Enhancement and Dereverberation with Diffusion-based Generative Models

Arxiv

0+阅读 · 2022年8月11日

Multi-Agent Reinforcement Learning with Graph Convolutional Neural Networks for optimal Bidding Strategies of Generation Units in Electricity Markets

Arxiv

0+阅读 · 2022年8月11日

Solving MathWord Problems Automatically with Heterogeneous Line Graph Transformer for Online Learning

Arxiv

0+阅读 · 2022年8月11日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【Google可解释人工智能白皮书】27页pdf，AI Explainability Whitepaper ，Introduction to AI Explanations for AI Platform

【Google可解释人工智能白皮书】27页pdf，AI Explainability Whitepaper ，Introduction to AI Explanations for AI Platform

专知会员服务

127+阅读 · 2019年12月13日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

相关论文

Semi-supervised Human Pose Estimation in Art-historical Images

Arxiv

0+阅读 · 2022年8月15日

Doubly Robust Estimation under Covariate-induced Dependent Left Truncation

Arxiv

0+阅读 · 2022年8月14日

Heterogeneous Line Graph Transformer for Math Word Problems

Arxiv

1+阅读 · 2022年8月12日

Domain-Specific Text Generation for Machine Translation

Domain-Specific Text Generation for Machine Translation

Arxiv

0+阅读 · 2022年8月11日

Speech Enhancement and Dereverberation with Diffusion-based Generative Models

Arxiv

0+阅读 · 2022年8月11日

Multi-Agent Reinforcement Learning with Graph Convolutional Neural Networks for optimal Bidding Strategies of Generation Units in Electricity Markets

Arxiv

0+阅读 · 2022年8月11日

Solving MathWord Problems Automatically with Heterogeneous Line Graph Transformer for Online Learning

Arxiv

0+阅读 · 2022年8月11日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

相关基金

亲水性氨基酸离子液体吸收CO2的传质-反应机理

国家自然科学基金

0+阅读 · 2014年12月31日

面向X-CT应用的(Ce, Lu)3(Cr, Al)5O12闪烁陶瓷中过渡金属离子的光谱展宽效应研究

国家自然科学基金

0+阅读 · 2014年12月31日

固定化离子液体调控酶催化甘油解反应产物组成的机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

非线性离散可积方程与离散Painlevé方程族的连续极限理论

国家自然科学基金

0+阅读 · 2013年12月31日

高稳定性银基复合光催化剂的制备及可见光催化降解有机污染物的研究

国家自然科学基金

0+阅读 · 2012年12月31日

功能化离子液体催化转换CO2合成含氮杂环化合物

国家自然科学基金

0+阅读 · 2012年12月31日

激光熔覆含硼BCC结构高熵合金涂层强韧化机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

高效中红外激光晶体Cr,Er,Re:YSGG（Re＝Eu3+, Tb3+）的生长及性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

CeO2催化CO2和甲醇合成碳酸二甲酯构效关系和反应机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

聚合物/富勒烯太阳能电池光活性层形貌的超分子调控及稳定化

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员