In recent years, with the rapid development of neural networks, the security of deep learning models has received increasing attention, since these models are vulnerable to adversarial examples. Almost all existing gradient-based attack methods apply the sign function during generation to satisfy the perturbation budget under the $L_\infty$ norm. However, we find that the sign function may be improper for generating adversarial examples, since it distorts the exact gradient direction. Instead of using the sign function, we propose to directly utilize the exact gradient direction with a scaling factor to generate adversarial perturbations, which improves the attack success rate of adversarial examples even with smaller perturbations. We also theoretically prove that this method achieves better black-box transferability. Moreover, since the best scaling factor varies across images, we propose an adaptive scaling factor generator that seeks an appropriate scaling factor for each image, avoiding the computational cost of manually searching for one. Our method can be integrated with almost all existing gradient-based attack methods to further improve their attack success rates. Extensive experiments on the CIFAR10 and ImageNet datasets show that our method exhibits higher transferability and outperforms state-of-the-art methods.
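To make the contrast concrete, the following is a minimal sketch, assuming a PyTorch setting, of the difference between a standard sign-based update and an update that keeps the exact gradient direction rescaled by a factor. The normalization choice and the name `zeta` are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def sign_step(x, grad, eps):
    # Standard FGSM-style update: only the sign of the gradient is kept,
    # so the exact gradient direction is discarded.
    return torch.clamp(x + eps * grad.sign(), 0, 1)

def scaled_gradient_step(x, grad, eps, zeta):
    # Hypothetical sketch of the idea described above: keep the exact
    # gradient direction, rescale it by a factor `zeta`, then project back
    # into the L_inf ball of radius eps around x.
    direction = grad / (grad.abs().amax(dim=(1, 2, 3), keepdim=True) + 1e-12)
    x_adv = x + zeta * eps * direction
    x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # L_inf projection
    return torch.clamp(x_adv, 0, 1)
```

In this sketch, `zeta` plays the role of the scaling factor; an adaptive generator would predict a suitable `zeta` per image rather than relying on a manually tuned constant.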