Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model. These methods make no assumptions about the form of the attack or the classification model, and can thus defend pre-existing classifiers against unseen threats. However, their performance currently lags behind that of adversarial training methods. In this work, we propose DiffPure, which uses diffusion models for adversarial purification: given an adversarial example, we first diffuse it with a small amount of noise following a forward diffusion process, and then recover the clean image through the reverse generative process. To evaluate our method against strong adaptive attacks in an efficient and scalable way, we propose using the adjoint method to compute full gradients of the reverse generative process. Extensive experiments on three image datasets (CIFAR-10, ImageNet, and CelebA-HQ) with three classifier architectures (ResNet, WideResNet, and ViT) demonstrate that our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods, often by a large margin. Project page: https://diffpure.github.io.
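The purification procedure described above (diffuse the input partway along the forward process, then denoise it back with the reverse process) can be sketched as follows. This is a minimal illustration assuming a DDPM-style discretization with a linear beta schedule; `score_fn` is a hypothetical stub standing in for a pretrained score network, which an actual defense would require, and `t_star` plays the role of the small diffusion amount mentioned in the abstract:

```python
# Minimal sketch of diffusion-based purification (not the authors' implementation).
# Assumptions: DDPM discretization, linear beta schedule, stub score model.
import numpy as np

def make_schedule(num_steps=1000, beta_min=1e-4, beta_max=0.02):
    """Linear beta schedule and cumulative alpha products, as in DDPM."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def score_fn(x, t):
    """Hypothetical stand-in for a learned score network estimating
    grad_x log p_t(x). Here: the score of a standard Gaussian."""
    return -x

def purify(x_adv, t_star=100, num_steps=1000, seed=0):
    """Diffuse the adversarial input up to step t_star, then denoise to t=0."""
    rng = np.random.default_rng(seed)
    betas, alpha_bars = make_schedule(num_steps)

    # Forward diffusion in closed form:
    # x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps
    abar = alpha_bars[t_star - 1]
    x = np.sqrt(abar) * x_adv + np.sqrt(1.0 - abar) * rng.standard_normal(x_adv.shape)

    # Reverse (ancestral) sampling from t_star back down to 1
    for t in range(t_star, 0, -1):
        beta_t, abar_t = betas[t - 1], alpha_bars[t - 1]
        # Noise prediction recovered from the score: eps = -sqrt(1 - abar_t) * score
        eps_hat = -np.sqrt(1.0 - abar_t) * score_fn(x, t)
        x = (x - beta_t / np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(1.0 - beta_t)
        if t > 1:  # no noise added at the final step
            x = x + np.sqrt(beta_t) * rng.standard_normal(x.shape)
    return x
```

The purified output `purify(x_adv)` would then be passed to the unmodified, pre-existing classifier. Computing attack gradients through the reverse loop above is memory-intensive, which is why the abstract proposes the adjoint method for full-gradient evaluation under adaptive attacks.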