Deep neural networks are vulnerable to adversarial examples: images crafted by adding small, human-imperceptible perturbations to the original inputs that cause the model to output incorrect predictions. Adversarial attacks are therefore an important tool for evaluating and selecting robust models before deep neural networks are deployed in safety-critical applications. However, in the challenging black-box setting, the attack success rate, i.e., the transferability of adversarial examples, still needs improvement. Building on image augmentation methods, we find that randomly transforming image brightness can eliminate overfitting in the generation of adversarial examples and improve their transferability. We therefore propose an adversarial example generation method based on this observation, which can be integrated with Fast Gradient Sign Method (FGSM)-related methods to build a stronger gradient-based attack and generate adversarial examples with better transferability. Extensive experiments on the ImageNet dataset demonstrate the method's effectiveness. On both normally and adversarially trained networks, our method achieves a higher black-box attack success rate than other attacks based on data augmentation. We hope that this method can help to evaluate and improve the robustness of models.
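For context, the following is a minimal sketch of how a random brightness transformation can be folded into an iterative FGSM-style attack; it is not the authors' exact algorithm, and the model interface, hyperparameters (step count, number of brightness-scaled copies, brightness range), and the assumption that pixels lie in [0, 1] are all illustrative choices.

```python
import torch
import torch.nn.functional as F

def brightness_augmented_fgsm(model, x, y, eps=8/255, steps=10,
                              n_copies=5, brightness_range=(0.5, 1.5)):
    """Sketch of an I-FGSM attack that averages gradients over randomly
    brightness-scaled copies of the current adversarial image.
    All hyperparameter defaults are assumptions for illustration."""
    alpha = eps / steps                      # per-step size
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        grad = torch.zeros_like(x_adv)
        for _ in range(n_copies):
            # Randomly scale the brightness of each image in the batch,
            # then clip back to the valid pixel range.
            scale = torch.empty(x_adv.size(0), 1, 1, 1,
                                device=x_adv.device).uniform_(*brightness_range)
            x_bright = torch.clamp(x_adv * scale, 0.0, 1.0)
            loss = F.cross_entropy(model(x_bright), y)
            grad = grad + torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            # Take a signed gradient step, as in FGSM/I-FGSM, then project
            # back into the L_inf eps-ball around the original image.
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```

Averaging gradients over several randomly brightness-scaled copies plays the same role as other input-transformation attacks: it discourages the perturbation from overfitting to the white-box model, which is the mechanism the abstract credits for the improved transferability.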