Backdoor attacks are considered a severe security threat to deep learning. Such attacks make a model behave abnormally on inputs that contain a predefined trigger while retaining state-of-the-art performance on clean data. Although backdoor attacks have been thoroughly investigated in the image domain from both the attacker's and the defender's side, an analysis in the frequency domain has been missing thus far. This paper first revisits existing backdoor triggers from a frequency perspective and performs a comprehensive analysis. Our results show that many current backdoor attacks exhibit severe high-frequency artifacts, which persist across different datasets and resolutions. We further demonstrate that these high-frequency artifacts enable a simple way to detect existing backdoor triggers at a detection rate of 98.50%, without prior knowledge of the attack details or the target model. Acknowledging the weaknesses of previous attacks, we propose a practical way to create smooth backdoor triggers without high-frequency artifacts and study their detectability. We show that existing defenses can benefit from incorporating these smooth triggers into their design considerations. Moreover, we show that a detector tuned on stronger smooth triggers generalizes well to unseen, weaker smooth triggers. In short, our work emphasizes the importance of frequency analysis when designing both backdoor attacks and defenses in deep learning.
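To make the notion of a high-frequency artifact concrete, the following is a minimal sketch, not the paper's detector. It assumes a 16x16 grayscale toy image, a hypothetical 3x3 checkerboard patch as the trigger, and a hand-picked cutoff on the 2-D DCT indices; the helper names (`dct_2d`, `high_freq_ratio`, `add_patch_trigger`) are illustrative only. It compares the share of non-DC spectral energy in the high-frequency region before and after the patch is stamped onto a smooth image.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (pure Python, fine for small inputs)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(img):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in img]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def high_freq_ratio(img, cutoff):
    """Fraction of non-DC spectral energy at DCT indices with u + v >= cutoff."""
    spec = dct_2d(img)
    total = 0.0
    high = 0.0
    for u, row in enumerate(spec):
        for v, coeff in enumerate(row):
            if u == 0 and v == 0:
                continue  # skip the DC term (mean brightness)
            energy = coeff * coeff
            total += energy
            if u + v >= cutoff:
                high += energy
    return high / total if total > 0 else 0.0

def smooth_image(n):
    """Toy clean image built from the lowest-frequency cosines only."""
    return [[0.5
             + 0.25 * math.cos(math.pi * (i + 0.5) / n)
             + 0.25 * math.cos(math.pi * (j + 0.5) / n)
             for j in range(n)] for i in range(n)]

def add_patch_trigger(img, size=3):
    """Stamp a checkerboard patch in the bottom-right corner (assumed toy trigger)."""
    out = [row[:] for row in img]
    n = len(img)
    for i in range(n - size, n):
        for j in range(n - size, n):
            out[i][j] = 1.0 if (i + j) % 2 == 0 else 0.0
    return out

N = 16
clean = smooth_image(N)
poisoned = add_patch_trigger(clean)

# The cutoff u + v >= N is an arbitrary choice for this illustration.
clean_ratio = high_freq_ratio(clean, cutoff=N)
poisoned_ratio = high_freq_ratio(poisoned, cutoff=N)
print(f"clean high-frequency share:    {clean_ratio:.6f}")
print(f"poisoned high-frequency share: {poisoned_ratio:.6f}")
```

The clean toy image has essentially no energy above the cutoff, while the patched image does, which is the kind of gap a frequency-domain detector could exploit. A fixed energy threshold like this is only a stand-in; the abstract does not specify the detector's design, and a trigger smoothed to avoid high-frequency content would be expected to narrow this gap.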