The Royalflush System for VoxCeleb Speaker Recognition Challenge 2022

September 19, 2022

Jingguang Tian, Xinhui Hu, Xinkang Xu

In this technical report, we describe the Royalflush submissions to the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22). Our submissions cover Track 1 (supervised speaker verification) and Track 3 (semi-supervised speaker verification). For Track 1, we develop a U-Net-based speaker embedding extractor with a symmetric architecture. The proposed system achieves an EER of 2.06% and a MinDCF of 0.1293 on the validation set. Compared with the state-of-the-art ECAPA-TDNN, this is a relative improvement of 20.7% in EER and 22.70% in MinDCF. For Track 3, we jointly train with source-domain supervision and target-domain self-supervision to obtain a speaker embedding extractor. A subsequent clustering step then produces pseudo-speaker labels for the target domain. We adapt the speaker embedding extractor on all source- and target-domain data in a supervised manner, so that it can leverage information from both domains. Clustering and supervised domain adaptation can be repeated until performance converges on the validation set. Our final submission is a fusion of 10 models and achieves 7.75% EER and 0.3517 MinDCF on the validation set.
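
The abstract reports results as equal error rate (EER) and as relative improvement over a baseline. As a reading aid, the following is a minimal sketch of how these two quantities are commonly computed; it is not the authors' evaluation code. The function names `compute_eer` and `relative_improvement` and the toy scores are assumptions made for illustration, and the sketch assumes that higher scores mean "same speaker" and that scores contain no ties.

```python
import numpy as np


def compute_eer(scores, labels):
    """Return the equal error rate as a fraction in [0, 1].

    scores: one similarity score per trial (higher = more likely same speaker).
    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    Assumes distinct scores; tied scores would need to be grouped first.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Sort trials by descending score. Accepting the top-k trials for
    # k = 0..N is equivalent to sweeping the decision threshold.
    order = np.argsort(-scores, kind="mergesort")
    labels = labels[order]

    n_target = labels.sum()
    n_nontarget = labels.size - n_target

    # Counts of accepted target / non-target trials after accepting k trials.
    true_accepts = np.concatenate(([0], np.cumsum(labels)))
    false_accepts = np.concatenate(([0], np.cumsum(1 - labels)))

    far = false_accepts / n_nontarget      # false acceptance rate
    frr = 1.0 - true_accepts / n_target    # false rejection rate

    # The EER is taken at the operating point where FAR and FRR are closest.
    idx = np.argmin(np.abs(far - frr))
    return float((far[idx] + frr[idx]) / 2.0)


def relative_improvement(baseline, new):
    """Relative reduction of an error metric (lower is better)."""
    return (baseline - new) / baseline


if __name__ == "__main__":
    # Toy trial list, invented for illustration only.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
    print("EER:", compute_eer(toy_scores, toy_labels))
    print("Relative improvement:", relative_improvement(2.0, 1.5))
```

The sketch picks the threshold where the two error rates are closest and averages them; production toolkits typically interpolate between neighbouring operating points instead, so values can differ slightly on small trial lists.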
