The performance of automatic speaker verification (ASV) systems can be degraded by voice spoofing attacks. Most existing work has aimed to develop standalone spoofing countermeasure (CM) systems; relatively little has targeted an integrated spoofing-aware speaker verification (SASV) system. The recent SASV challenge encouraged the development of such integrated systems by releasing official protocols and baselines. In this paper, we build a probabilistic framework for fusing the scores of the ASV and CM subsystems. Based on this framework, we further propose fusion strategies, for both direct inference and fine-tuning, to predict the SASV score. Surprisingly, these strategies reduce the SASV equal error rate (EER) from 19.31% for the baseline to 1.53% on the official evaluation trials of the SASV challenge. We verify the effectiveness of the proposed components through ablation studies and provide insights through score distribution analysis.
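As a rough illustration of what fusing ASV and CM scores probabilistically can look like, the sketch below multiplies two calibrated probabilities. It is not the paper's exact formulation: it assumes that the "same speaker" and "bona fide" events are independent, that the ASV score is a cosine similarity, and that the CM score is a logit. The function names (`cosine_to_prob`, `fuse_sasv`) and both score mappings are hypothetical choices made for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def cosine_to_prob(cos_score: float) -> float:
    """Assumed mapping: rescale a cosine similarity in [-1, 1] to [0, 1]."""
    return (cos_score + 1.0) / 2.0


def fuse_sasv(asv_cosine: float, cm_logit: float) -> float:
    """Product-rule fusion under an independence assumption.

    P(accept) = P(same speaker) * P(bona fide)
    """
    p_target = cosine_to_prob(asv_cosine)   # from the ASV subsystem
    p_bonafide = sigmoid(cm_logit)          # from the CM subsystem
    return p_target * p_bonafide


if __name__ == "__main__":
    # A trial is accepted only if both subsystems are confident.
    print(fuse_sasv(0.9, 4.0))    # target speaker, likely bona fide
    print(fuse_sasv(0.9, -4.0))   # target speaker, likely spoofed
    print(fuse_sasv(-0.2, 4.0))   # different speaker, likely bona fide
```

With a product rule of this kind, the fused score is high only when both subsystems agree, so a spoofed utterance that matches the target speaker is still scored low.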