Adversarial patch attacks are a family of attack algorithms that perturb a part of an image to fool a deep neural network model. Existing patch attacks mostly inject adversarial patches at input-agnostic locations: either a predefined location or a random location. This attack setup may be sufficient for attack, but it has considerable limitations when used for adversarial training. Thus, robust models trained with existing patch attacks cannot effectively defend against other adversarial attacks. In this paper, we first propose an end-to-end patch attack algorithm, Generative Dynamic Patch Attack (GDPA), which adversarially generates both the patch pattern and the patch location for each input image. We show that GDPA is a generic attack framework that can produce dynamic/static and visible/invisible patches with a few configuration changes. Second, GDPA can be readily integrated into adversarial training to improve model robustness to various adversarial attacks. Extensive experiments on VGGFace, Traffic Sign and ImageNet show that GDPA achieves higher attack success rates than state-of-the-art patch attacks, while models adversarially trained with GDPA demonstrate superior robustness to adversarial patch attacks compared to competing methods. Our source code can be found at https://github.com/lxuniverse/gdpa.
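To make the end-to-end idea concrete, below is a minimal PyTorch-style sketch of the differentiable patch placement that such an attack requires, so that gradients can flow to both the patch pattern and its location. The helper name `apply_patch` and the translation-only affine warp are illustrative assumptions, not the paper's exact implementation; see the repository above for the authors' code.

```python
import torch
import torch.nn.functional as F

def apply_patch(image, patch, loc):
    """Differentiably paste `patch` into `image` at normalized offset `loc`.

    image: (B, C, H, W); patch: (B, C, p, p); loc: (B, 2) with values in [-1, 1].
    Because the placement is built from F.affine_grid/F.grid_sample, gradients
    flow to both the patch pattern and its location, which is what a dynamic
    (per-image pattern and location) patch attack needs for end-to-end training.
    """
    B, C, H, W = image.shape
    p = patch.shape[-1]
    # Pad the patch (and a matching binary mask) into a full-size, centered canvas.
    pad_h, pad_w = (H - p) // 2, (W - p) // 2
    pad = (pad_w, W - p - pad_w, pad_h, H - p - pad_h)  # (left, right, top, bottom)
    canvas = F.pad(patch, pad)
    mask = F.pad(torch.ones_like(patch), pad)
    # Translation-only affine warp: shift the canvas content by `loc`
    # (grid_sample reads input at theta @ [x_out, y_out, 1], hence the minus sign).
    theta = torch.zeros(B, 2, 3, device=image.device)
    theta[:, 0, 0] = theta[:, 1, 1] = 1.0
    theta[:, 0, 2] = -loc[:, 0]
    theta[:, 1, 2] = -loc[:, 1]
    grid = F.affine_grid(theta, image.size(), align_corners=False)
    canvas = F.grid_sample(canvas, grid, align_corners=False)
    mask = F.grid_sample(mask, grid, align_corners=False)
    # Composite: keep the image where the mask is 0, the patch where it is 1.
    return image * (1 - mask) + canvas * mask
```

With a placement like this, an attacker can maximize the classifier's loss with respect to a generator that outputs `(patch, loc)` per image, and adversarial training then wraps this in the usual min-max loop, with the model minimizing the loss on the patched images.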