Disentangled representation learning for multilingual speaker recognition

The goal of this paper is to train speaker embeddings that are robust to bilingual speaking scenarios. The majority of the world's population speaks at least two languages; however, most speaker recognition systems fail to recognise the same speaker when that speaker uses different languages. Popular speaker recognition evaluation sets do not consider the bilingual scenario, making it difficult to analyse the effect of bilingual speakers on speaker recognition performance. This paper proposes a new large-scale evaluation set derived from VoxCeleb that considers bilingual scenarios. We also introduce a representation learning strategy that disentangles language information from the speaker representation to account for the bilingual scenario. This language-disentangled representation learning strategy can be adapted to existing models with small changes to the training pipeline. Experimental results demonstrate that the baseline models suffer significant performance degradation when evaluated on the proposed bilingual test set. In contrast, the model trained with the proposed disentanglement strategy shows significant improvement under the bilingual evaluation scenario while retaining competitive performance on existing monolingual test sets.

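To make the bilingual evaluation idea concrete, the following is a minimal sketch of how verification trials might be split into same-language and cross-language groups and scored separately. It is not the paper's evaluation code. The trial dictionary layout (`emb1`, `emb2`, `lang1`, `lang2`, `same_speaker`), the function names, the use of cosine scoring, and the choice of equal error rate (EER) as the metric are all assumptions made for illustration; the abstract does not name a metric or a trial format.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def equal_error_rate(scores, labels):
    """Approximate EER by sweeping thresholds over the observed scores.

    labels: 1 for same-speaker (target) trials, 0 for different-speaker trials.
    Returns the mean of the false-accept and false-reject rates at the
    threshold where the two rates are closest.
    """
    targets = [s for s, l in zip(scores, labels) if l == 1]
    nontargets = [s for s, l in zip(scores, labels) if l == 0]
    if not targets or not nontargets:
        raise ValueError("need at least one target and one non-target trial")
    best_gap, eer = None, None
    for threshold in sorted(set(scores)):
        frr = sum(s < threshold for s in targets) / len(targets)
        far = sum(s >= threshold for s in nontargets) / len(nontargets)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def eer_by_language_condition(trials):
    """Group trials by whether both utterances share a language, then score each group.

    trials: list of dicts with hypothetical keys emb1, emb2, lang1, lang2, same_speaker.
    """
    groups = {"same_language": ([], []), "cross_language": ([], [])}
    for trial in trials:
        key = "same_language" if trial["lang1"] == trial["lang2"] else "cross_language"
        groups[key][0].append(cosine(trial["emb1"], trial["emb2"]))
        groups[key][1].append(1 if trial["same_speaker"] else 0)
    return {k: equal_error_rate(s, l) for k, (s, l) in groups.items() if s}
```

Reporting the two groups side by side is one way to expose the degradation the abstract describes: a model whose embeddings carry language information would be expected to score worse on the cross-language group than on the same-language group.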


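Speaker recognition, also called voiceprint recognition (VPR), is a biometric identification technology that automatically determines a speaker's identity from the speaker-specific information contained in speech, using computers and modern information recognition techniques. The aim of speaker recognition research is to extract speaker-discriminative features from speech and to build effective models and systems that identify speakers automatically and accurately.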
