Crafting adversarial examples for transfer-based attacks is challenging and remains a research hotspot. Currently, such attack methods rest on the hypothesis that the substitute model and the victim's model learn similar decision boundaries, and they conventionally apply the Sign Method (SM) to manipulate the gradient into the resultant perturbation. Although SM is efficient, it only extracts the sign of each gradient unit and ignores the differences in their values, which inevitably leads to a serious deviation. Therefore, we propose a novel Staircase Sign Method (S$^2$M) to alleviate this issue, thus boosting transfer-based attacks. Technically, our method heuristically divides the gradient sign into several segments according to the values of the gradient units, and then assigns each segment a staircase weight for better crafting of the adversarial perturbation. As a result, our adversarial examples perform better in both the white-box and black-box settings without becoming more perceptible. Since S$^2$M only manipulates the resultant gradient, our method can be readily integrated into any transfer-based attack, and the computational overhead is negligible. Extensive experiments on the ImageNet dataset demonstrate the effectiveness of our proposed method, which significantly improves transferability (i.e., on average, \textbf{5.1\%} for normally trained models and \textbf{11.2\%} for adversarially trained defenses). Our code is available at: \url{https://github.com/qilong-zhang/Staircase-sign-method}.
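To make the idea concrete, below is a minimal NumPy sketch of a staircase-weighted sign, under the assumption that the gradient magnitudes are split into $k$ equal-percentile segments and each segment receives a linearly increasing weight, normalized so the mean weight is 1 (the exact segmentation and weighting scheme in the paper may differ; `staircase_sign` and its parameters are illustrative names, not the released implementation):

```python
import numpy as np

def staircase_sign(grad, k=64):
    """Illustrative staircase sign: split |grad| into k equal-percentile
    segments, then weight sign(grad) by a linearly increasing staircase
    whose mean weight is 1 (so its overall scale matches plain sign)."""
    mag = np.abs(grad)
    # percentile edges of the magnitudes: 0, 100/k, ..., 100
    edges = np.percentile(mag, np.linspace(0, 100, k + 1))
    # segment index 0..k-1 for each gradient unit
    seg = np.clip(np.searchsorted(edges, mag, side="right") - 1, 0, k - 1)
    # staircase weights 1/k, 3/k, ..., (2k-1)/k; their mean equals 1,
    # so larger-magnitude units get proportionally larger steps
    weights = (2 * seg + 1) / k
    return weights * np.sign(grad)
```

Replacing `np.sign(grad)` with `staircase_sign(grad)` inside an iterative attack such as I-FGSM would be the natural integration point, since only the resultant gradient is manipulated.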