Domain-Dependent Speaker Diarization for the Third DIHARD Challenge

January 25, 2021


A Kishore Kumar, Shefali Waldekar, Goutam Saha, Md Sahidullah

From arXiv. This work was presented at the Third DIHARD Speech Diarization Challenge Workshop.

This report presents the system developed by the ABSP Laboratory team for the third DIHARD speech diarization challenge. Our main contribution is a simple and efficient solution for acoustic-domain-dependent speech diarization. We explore speaker embeddings for the acoustic domain identification (ADI) task and find that the i-vector based method performs considerably better than the x-vector based approach on the third DIHARD challenge dataset. We then integrate the ADI module with the diarization framework. Performance improved substantially over the baseline when we optimized, separately for each acoustic domain, the thresholds for agglomerative hierarchical clustering and the parameters for dimensionality reduction during scoring. We achieved relative improvements of 9.63% and 10.64% in DER for the core and full conditions, respectively, on Track 1 of the DIHARD III evaluation set.
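The idea of tuning the clustering threshold per acoustic domain can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the domain label has already been produced by an ADI step, it uses cosine distance with average linkage as a stand-in for whatever scoring the actual system uses, and it omits the dimensionality-reduction step. The names `DOMAIN_THRESHOLDS`, `DEFAULT_THRESHOLD`, `ahc`, and `diarize`, the domain labels, and the threshold values are all hypothetical placeholders.

```python
import math

# Hypothetical per-domain stopping thresholds on cosine distance.
# The domain names and values are placeholders, not taken from the paper.
DOMAIN_THRESHOLDS = {"clinical": 0.3, "meeting": 0.5}
DEFAULT_THRESHOLD = 0.4  # fallback when the domain is not in the table


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def ahc(embeddings, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the closest pair of clusters and stops once the
    smallest inter-cluster distance exceeds the threshold.
    Returns one integer cluster label per input embedding.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def avg_dist(c1, c2):
        total = sum(
            cosine_distance(embeddings[i], embeddings[j]) for i in c1 for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters (indices a < b).
        dist, a, b = min(
            (avg_dist(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a].extend(clusters[b])
        del clusters[b]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def diarize(embeddings, domain):
    """Cluster segment embeddings with the threshold chosen for the domain."""
    threshold = DOMAIN_THRESHOLDS.get(domain, DEFAULT_THRESHOLD)
    return ahc(embeddings, threshold)
```

Calling `diarize(segment_embeddings, "meeting")` clusters the segments with the threshold stored for that domain, so the same embeddings can yield a different number of speakers under a different domain label. In a real system the table entries would be tuned on development data for each domain.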


