A Principle Solution for Enroll-Test Mismatch in Speaker Recognition

November 24, 2021

Lantian Li, Dong Wang, Jiawen Kang, Renyu Wang, Jing Wu, Zhendong Gao, Xiao Chen

Mismatch between enrollment and test conditions causes serious performance degradation in speaker recognition systems. This paper presents a statistics decomposition (SD) approach to solve this problem. The approach decomposes the PLDA score into three components that correspond to enrollment, prediction, and normalization, respectively. Provided that the correct statistics are used in each component, the resulting score is theoretically optimal. A comprehensive experimental study was conducted on three datasets with different types of mismatch: (1) physical channel mismatch, (2) speaking behavior mismatch, and (3) near-far recording mismatch. The results demonstrate that the proposed SD approach is highly effective and outperforms the ad hoc multi-condition training approach, which is commonly adopted but not optimal in theory.
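
To make the three components named in the abstract concrete, here is a minimal sketch of a PLDA-style log-likelihood-ratio score split into enrollment, prediction, and normalization steps. It is not the authors' implementation: it assumes a scalar embedding, a zero global mean, and a simple two-covariance Gaussian model, and the names `plda_score`, `between_var`, and `within_var` are hypothetical. It also uses a single set of statistics for all three steps (the matched case); how the SD approach chooses condition-specific statistics for each step is not described in the abstract.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a scalar Gaussian N(x; mean, var)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def plda_score(enroll, test, between_var, within_var):
    """Log-likelihood ratio under an assumed scalar two-covariance model.

    Assumed model (illustrative only):
      speaker mean  mu ~ N(0, between_var)
      observation   x | mu ~ N(mu, within_var)
    """
    n = len(enroll)
    enroll_mean = sum(enroll) / n

    # Enrollment: posterior over the speaker mean given the enrollment data.
    denom = n * between_var + within_var
    post_mean = n * between_var / denom * enroll_mean
    post_var = between_var * within_var / denom

    # Prediction: likelihood of the test sample under the enrolled speaker.
    log_pred = log_gauss(test, post_mean, post_var + within_var)

    # Normalization: likelihood of the test sample under any speaker.
    log_norm = log_gauss(test, 0.0, between_var + within_var)

    return log_pred - log_norm


same = plda_score([1.0, 1.2], 1.1, between_var=1.0, within_var=0.1)
diff = plda_score([1.0, 1.2], -1.1, between_var=1.0, within_var=0.1)
print(same > diff)
```

In this toy setting, a test sample close to the enrollment samples receives a positive score and a distant one receives a negative score. Under mismatch, one would expect the variances used in each of the three steps to differ by condition, which is the part this sketch leaves out.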

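Speaker Recognition

Speaker recognition, also called voiceprint recognition (VPR), is a biometric technique that automatically identifies a speaker from the speaker-specific information carried in speech, using computers and modern information recognition technology. The goal of speaker recognition research is to extract features from speech that characterize the speaker and to build effective models and systems that identify speakers automatically and accurately.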
