Speaker verification has been widely and successfully adopted for user identification in many mission-critical areas. Training speaker verification models requires a large amount of data; therefore, users usually need to adopt third-party data ($e.g.$, data from the Internet or a third-party data company). This raises the question of whether adopting untrusted third-party data can pose a security threat. In this paper, we demonstrate that it is possible to inject a hidden backdoor into speaker verification models by poisoning the training data. Specifically, based on our understanding of verification tasks, we design a clustering-based attack scheme in which poisoned samples from different clusters contain different triggers ($i.e.$, pre-defined utterances). The infected models behave normally on benign samples, whereas attacker-specified unenrolled triggers can successfully pass verification even when the attacker has no information about the enrolled speaker. We also demonstrate that existing backdoor attacks cannot be directly adopted to attack speaker verification. Our approach not only provides a new perspective for designing novel attacks, but also serves as a strong baseline for improving the robustness of verification methods. The code for reproducing the main results is available at \url{https://github.com/zhaitongqing233/Backdoor-attack-against-speaker-verification}.
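The clustering-based poisoning scheme described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the toy k-means with farthest-point initialization, the additive trigger mixing with weight \texttt{alpha}, and all function names are assumptions introduced here. The idea it captures is the one the abstract states: speakers are grouped by their embeddings, each cluster is assigned its own trigger utterance, and that trigger is superimposed onto the training utterances of speakers in that cluster.

```python
import numpy as np

def kmeans(X, k, iters=10):
    """Tiny k-means with deterministic farthest-point initialization
    (illustrative stand-in for whatever clustering the attack uses)."""
    centers = [X[0].astype(float)]
    for _ in range(k - 1):
        # Next center: the point farthest from all chosen centers.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)].astype(float))
    centers = np.array(centers)
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute means.
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def poison_dataset(utterances, speakers, speaker_embeddings, triggers, alpha=0.1):
    """Cluster speakers by embedding, assign one trigger waveform per
    cluster, and superimpose each speaker's trigger onto their utterances.

    utterances: list of 1-D waveform arrays (all the same length here)
    speakers:   speaker index of each utterance
    triggers:   list of len-k 1-D trigger waveforms, one per cluster
    """
    labels = kmeans(speaker_embeddings, len(triggers))
    poisoned = [u + alpha * triggers[labels[s]]
                for u, s in zip(utterances, speakers)]
    return poisoned, labels
```

With two well-separated groups of speaker embeddings, all speakers in one group receive the same trigger, and the two groups receive different triggers, which is the property the attack relies on.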