Backdoor Attacks Against Deep Learning Systems in the Physical World

Backdoor attacks embed hidden malicious behaviors into deep learning models, which only activate and cause misclassifications on model inputs containing a specific trigger. Existing works on backdoor attacks and defenses, however, mostly focus on digital attacks that use digitally generated patterns as triggers. A critical question remains unanswered: can backdoor attacks succeed using physical objects as triggers, thus making them a credible threat against deep learning systems in the real world? We conduct a detailed empirical study to explore this question for facial recognition, a critical deep learning task. Using seven physical objects as triggers, we collect a custom dataset of 3205 images of ten volunteers and use it to study the feasibility of physical backdoor attacks under a variety of real-world conditions. Our study reveals two key findings. First, physical backdoor attacks can be highly successful if they are carefully configured to overcome the constraints imposed by physical objects. In particular, the placement of successful triggers is largely constrained by the target model's dependence on key facial features. Second, four of today's state-of-the-art defenses against (digital) backdoors are ineffective against physical backdoors, because the use of physical objects breaks core assumptions used to construct these defenses. Our study confirms that (physical) backdoor attacks are not a hypothetical phenomenon but rather pose a serious real-world threat to critical classification tasks. We need new and more robust defenses against backdoors in the physical world.
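
The abstract does not spell out how the poisoned training data is assembled or how success is scored, so the following is only a minimal sketch under stated assumptions. It assumes the attacker mixes photos of subjects wearing the physical trigger, relabeled to a chosen target identity, into an otherwise clean training set, and measures success as the fraction of trigger-bearing inputs classified as that target. The names `build_poisoned_set`, `attack_success_rate`, `injection_rate`, and `predict` are hypothetical and do not come from the paper.

```python
import random


def build_poisoned_set(clean, triggered, target_label, injection_rate, seed=0):
    """Mix trigger-bearing samples, relabeled to target_label, into a clean set.

    clean:          list of (image, label) pairs without the trigger
    triggered:      list of images in which the subject wears the trigger object
    injection_rate: desired fraction of poisoned samples in the final set
                    (an assumed knob, not a value taken from the paper)
    """
    if not 0.0 <= injection_rate < 1.0:
        raise ValueError("injection_rate must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_poison / (n_clean + n_poison) = injection_rate for n_poison.
    n_poison = round(injection_rate * len(clean) / (1.0 - injection_rate))
    # Cannot use more trigger photos than were actually collected.
    n_poison = min(n_poison, len(triggered))
    poisoned = [(img, target_label) for img in rng.sample(triggered, n_poison)]
    mixed = clean + poisoned
    rng.shuffle(mixed)
    return mixed


def attack_success_rate(predict, triggered, target_label):
    """Fraction of trigger-bearing inputs that the model maps to target_label."""
    if not triggered:
        return 0.0
    hits = sum(1 for img in triggered if predict(img) == target_label)
    return hits / len(triggered)
```

Here `predict` stands in for any callable that maps an image to a predicted label. Unlike a digital backdoor, nothing is stamped onto pixels in this sketch: the trigger is already present in the photographed scene, which is the setting the study examines.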
