Deep neural networks (DNNs) are vulnerable to adversarial examples, which are crafted by adding imperceptible perturbations to inputs. Although various attacks and strategies have been proposed recently, how to generate adversarial examples that are both perceptually realistic and efficient to produce remains an open problem. This paper proposes a novel framework called Attack-Inspired GAN (AI-GAN), in which a generator, a discriminator, and an attacker are trained jointly. Once trained, the generator can efficiently produce adversarial perturbations given input images and target classes. Through extensive experiments on several popular datasets \eg MNIST and CIFAR-10, AI-GAN achieves high attack success rates and significantly reduces generation time in various settings. Moreover, for the first time, AI-GAN successfully scales to complicated datasets \eg CIFAR-100, with around $90\%$ success rates across all classes.
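To make the idea of single-pass targeted perturbation generation concrete, the following is a minimal PyTorch sketch of a conditional perturbation generator in the spirit of AI-GAN. The abstract does not specify architectural details, so every layer size, the class-embedding dimension, the $\epsilon$ bound, and all names (\eg \texttt{PerturbationGenerator}) are illustrative assumptions, not the authors' implementation.

\begin{verbatim}
# A minimal sketch (not the authors' code): a generator that maps an
# input image and a target class to a bounded additive perturbation.
# All hyperparameters and module names are illustrative assumptions.
import torch
import torch.nn as nn

class PerturbationGenerator(nn.Module):
    def __init__(self, num_classes=10, channels=1, eps=0.3):
        super().__init__()
        self.eps = eps  # assumed L-infinity bound on the perturbation
        self.embed = nn.Embedding(num_classes, 16)  # target-class conditioning
        self.net = nn.Sequential(
            nn.Conv2d(channels + 16, 32, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
            nn.Tanh(),  # outputs in [-1, 1], scaled below to [-eps, eps]
        )

    def forward(self, x, target):
        b, _, h, w = x.shape
        # Broadcast the class embedding over the spatial dimensions and
        # concatenate it with the image as conditioning channels.
        cond = self.embed(target).view(b, -1, 1, 1).expand(b, 16, h, w)
        delta = self.net(torch.cat([x, cond], dim=1)) * self.eps
        # A single forward pass yields the adversarial example; no
        # per-input iterative optimization is needed at inference time.
        return torch.clamp(x + delta, 0.0, 1.0)

# Usage: one batched forward pass generates targeted adversarial examples.
gen = PerturbationGenerator(num_classes=10, channels=1)
images = torch.rand(8, 1, 28, 28)        # e.g. MNIST-sized inputs
targets = torch.randint(0, 10, (8,))     # desired target classes
adv = gen(images, targets)
\end{verbatim}

The efficiency claim in the abstract follows from this design choice: unlike iterative attacks that optimize each input separately, a trained generator amortizes the attack cost into a single forward pass per batch.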