Convolutional neural networks (CNNs) have surpassed human-level accuracy on the image classification task and are widely deployed in real-world environments. However, CNNs are vulnerable to adversarial perturbations, i.e., well-designed noises that aim to mislead the classification models. To defend against adversarial perturbations, the adversarially trained GAN (ATGAN) is proposed to improve the adversarial robustness generalization of state-of-the-art CNNs trained by adversarial training. ATGAN incorporates adversarial training into the standard GAN training procedure to remove obfuscated gradients, which can give a false sense of security in defending against adversarial perturbations and are commonly observed in existing GAN-based adversarial defense methods. Moreover, ATGAN adopts an image-to-image generator for data augmentation to meet the increased sample complexity needed for adversarial robustness generalization in adversarial training. Experimental results on the MNIST, SVHN, and CIFAR-10 datasets show that the proposed method does not rely on obfuscated gradients and achieves better global adversarial robustness generalization performance than adversarially trained state-of-the-art CNNs.
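To make the described combination of adversarial training and GAN-based data augmentation concrete, the following is a minimal sketch, not the authors' implementation: it assumes a hypothetical pre-trained image-to-image `generator` and a `classifier`, and uses a standard PGD attack to craft the adversarial examples used in the robust training step.

```python
# Minimal sketch (assumed names and hyper-parameters, not the ATGAN reference code):
# augment a clean batch with image-to-image generator outputs, craft PGD adversarial
# examples on the augmented batch, then minimize the robust classification loss.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=7):
    """Projected gradient descent attack under an L_inf budget `eps`."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()


def robust_training_step(classifier, generator, optimizer, x, y):
    """One adversarial-training step on a batch augmented by the generator."""
    with torch.no_grad():
        x_aug = generator(x)                     # image-to-image augmentation
    x_all = torch.cat([x, x_aug], dim=0)
    y_all = torch.cat([y, y], dim=0)
    x_adv = pgd_attack(classifier, x_all, y_all)  # adversarial examples on the augmented batch
    optimizer.zero_grad()
    loss = F.cross_entropy(classifier(x_adv), y_all)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the generator only supplies extra training samples; in the actual ATGAN procedure the generator is trained jointly within the GAN framework, which is what removes the obfuscated gradients reported for prior GAN-based defenses.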