Backdoor attacks intend to inject a hidden backdoor into deep neural networks (DNNs), such that the prediction of the infected model is maliciously changed once the hidden backdoor is activated by an attacker-defined trigger, while the model still performs well on benign samples. Currently, most existing backdoor attacks adopt the setting of a \emph{static} trigger, $i.e.$, triggers across the training and testing images have the same appearance and are located in the same area. In this paper, we revisit this attack paradigm by analyzing the characteristics of the static trigger. We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training. We further explore how to utilize this property for backdoor defense, and discuss how to alleviate such vulnerability of existing attacks.
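To make the static-trigger setting concrete, the following is a minimal sketch of BadNets-style data poisoning (not the implementation used in this paper), in which a fixed trigger patch is stamped at a fixed location of each poisoned training image and the label is flipped to an attacker-chosen target class. The trigger pattern, its location \texttt{(x, y)}, the poisoning rate, and the target label are illustrative assumptions.

\begin{verbatim}
import numpy as np

def stamp_static_trigger(image, trigger, x=0, y=0):
    """Paste a fixed trigger patch at a fixed (x, y) location (static setting)."""
    poisoned = image.copy()
    h, w = trigger.shape[:2]
    poisoned[y:y + h, x:x + w] = trigger
    return poisoned

def poison_dataset(images, labels, trigger, target_label, rate=0.1, x=0, y=0):
    """Stamp the trigger onto a fraction `rate` of training samples and
    relabel them to the attacker-chosen target class."""
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * rate)
    idx = np.random.choice(len(images), n_poison, replace=False)
    for i in idx:
        images[i] = stamp_static_trigger(images[i], trigger, x, y)
        labels[i] = target_label
    return images, labels
\end{verbatim}

Under this setting, the vulnerability discussed above corresponds to stamping the same trigger at test time with a shifted location or a perturbed appearance, which can sharply reduce the attack success rate of a model trained on such poisoned data.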