A backdoor attack intends to embed a hidden backdoor into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions are maliciously changed once the hidden backdoor is activated by attacker-specified triggers. This threat can arise whenever the training process is not fully controlled, such as when training on third-party datasets or adopting third-party models, and therefore poses a new and realistic risk. Although backdoor learning is an emerging and rapidly growing research area, a systematic review of it is still lacking. In this paper, we present the first comprehensive survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based backdoor attacks. Besides, we also analyze the relation between backdoor attacks and relevant fields ($i.e.$, adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline future research directions based on the reviewed works. A curated list of backdoor-related resources is also available at \url{https://github.com/THUYimingLi/backdoor-learning-resources}.
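To make the notion of a poisoning-based backdoor attack concrete, the sketch below illustrates the classic BadNets-style recipe in Python: a small trigger patch is stamped onto a fraction of the training images, and their labels are flipped to an attacker-chosen target class. This is a minimal illustration under our own assumptions (the function name \texttt{poison\_dataset} and all parameter choices are hypothetical), not code from the surveyed works.

\begin{verbatim}
# Minimal sketch (illustrative, not from the survey) of BadNets-style
# data poisoning: stamp a trigger patch on a fraction of training images
# and relabel them to an attacker-chosen target class. A model trained on
# the poisoned set behaves normally on benign inputs but predicts the
# target class whenever the trigger appears.
import numpy as np

def poison_dataset(images, labels, target_label=0, poison_rate=0.1,
                   trigger_value=1.0, patch_size=3, seed=0):
    """Return a poisoned copy of (images, labels).

    images: float array of shape (N, H, W), pixel values in [0, 1].
    labels: int array of shape (N,).
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # Stamp a square trigger in the bottom-right corner of each chosen image.
    images[idx, -patch_size:, -patch_size:] = trigger_value
    # Relabel the poisoned samples to the attacker-specified target class.
    labels[idx] = target_label
    return images, labels

# Usage on a toy dataset: 100 grayscale 28x28 images with 10 classes.
x = np.random.rand(100, 28, 28)
y = np.random.randint(0, 10, size=100)
x_poisoned, y_poisoned = poison_dataset(x, y, target_label=7, poison_rate=0.1)
\end{verbatim}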