Backdoor attacks allow an attacker to embed hidden functionality that compromises the proper behavior of any algorithm, machine learning or not. This functionality can remain inactive during normal use of the algorithm until the attacker activates it. Given how stealthy backdoor attacks are, their consequences could be disastrous if backdoored networks were deployed in applications as critical as border or access control. In this paper, we propose a novel method for detecting backdoored networks based on the principle of anomaly detection, which requires access to the clean part of the training data and to the trained network. We highlight its promising potential across various triggers, locations, and identity pairs, without making any assumptions about the nature of the backdoor or its setup. We test our method on a novel dataset of backdoored networks and report perfect detectability scores.