We propose a novel and effective input-transformation-based adversarial defense against gray- and black-box attacks that is computationally efficient and requires no adversarial training or retraining of the classification model. We first show that simple iterative Gaussian smoothing can effectively wash out adversarial noise and achieve high robust accuracy. Based on this observation, we propose Self-Supervised Iterative Contextual Smoothing (SSICS), which aims to reconstruct the original discriminative features from the Gaussian-smoothed image in a context-adaptive manner while still smoothing out the adversarial noise. In experiments on ImageNet, we show that SSICS achieves both high standard accuracy and very competitive robust accuracy against gray- and black-box attacks, e.g., transfer-based PGD attacks and score-based attacks. Notably, our defense avoids computationally expensive adversarial training, yet approaches the robust accuracy of adversarial training through input transformation alone.
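The iterative Gaussian smoothing baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and it does not include the self-supervised contextual reconstruction of SSICS; the function names and the values of `sigma`, `radius`, and `num_iters` are assumptions chosen only for illustration.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    offsets = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_smooth(image: np.ndarray, sigma: float = 1.0, radius: int = 2) -> np.ndarray:
    """Apply one separable Gaussian blur to an (H, W) or (H, W, C) image."""
    kernel = gaussian_kernel_1d(sigma, radius)
    out = image.astype(np.float64)
    for axis in (0, 1):  # blur along the height axis, then the width axis
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        # Reflect padding keeps the output the same size as the input.
        padded = np.pad(out, pad, mode="reflect")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out


def iterative_gaussian_smoothing(
    image: np.ndarray,
    num_iters: int = 3,   # assumed value, not taken from the paper
    sigma: float = 1.0,   # assumed value, not taken from the paper
    radius: int = 2,      # assumed value, not taken from the paper
) -> np.ndarray:
    """Repeatedly blur the input before it is passed to a classifier."""
    smoothed = image.astype(np.float64)
    for _ in range(num_iters):
        smoothed = gaussian_smooth(smoothed, sigma, radius)
    return smoothed
```

In a defense of this kind, the smoothed image would be fed to the unmodified classifier in place of the raw input. Repeated blurring suppresses small high-frequency perturbations but also removes fine image detail, which is the loss of discriminative features that SSICS is described as aiming to recover.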