We address speaker-aware anti-spoofing, where prior knowledge of the target speaker is incorporated into a voice spoofing countermeasure (CM). In contrast to the commonly used speaker-independent solutions, we train the CM in a speaker-conditioned way. As a proof of concept, we consider a speaker-aware extension to the state-of-the-art AASIST (audio anti-spoofing using integrated spectro-temporal graph attention networks) model. To this end, we consider two alternative strategies to incorporate target speaker information, at the frame and utterance levels, respectively. Experimental results on a custom protocol based on the ASVspoof 2019 dataset indicate the effectiveness of incorporating speaker information via enrollment: we obtain maximum relative improvements of 25.1% and 11.6% in equal error rate (EER) and minimum tandem detection cost function (t-DCF), respectively, over a speaker-independent baseline.
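The two conditioning strategies mentioned above can be illustrated with a minimal sketch. This is not the paper's actual architecture; the shapes, pooling choice, and concatenation-based fusion are illustrative assumptions about how an enrollment speaker embedding might be injected at the frame level versus the utterance level.

```python
import numpy as np

# Hypothetical dimensions, for illustration only (not from the paper):
# T frames of d_f-dim CM features, and a d_s-dim enrollment speaker embedding.
rng = np.random.default_rng(0)
T, d_f, d_s = 100, 64, 32
frames = rng.standard_normal((T, d_f))   # frame-level CM features
spk_emb = rng.standard_normal(d_s)       # target-speaker enrollment embedding

# Frame-level conditioning: broadcast the speaker embedding to every
# frame and concatenate along the feature axis.
frame_cond = np.concatenate([frames, np.tile(spk_emb, (T, 1))], axis=1)
print(frame_cond.shape)  # (100, 96) = (T, d_f + d_s)

# Utterance-level conditioning: pool the frames into a single utterance
# vector first (mean pooling here, as an assumption), then concatenate
# the speaker embedding once.
utt = frames.mean(axis=0)
utt_cond = np.concatenate([utt, spk_emb])
print(utt_cond.shape)    # (96,) = (d_f + d_s,)
```

In the frame-level variant the downstream layers see the speaker identity at every time step, while in the utterance-level variant it is fused only once after pooling; either concatenated representation would then feed the spoofing classifier.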