Preprocessing defenses such as pixel discretization aim to remove the effects of adversarial perturbations and are appealing for their simplicity, but so far they have been shown to be ineffective except on simple datasets such as MNIST. This has led to the belief that pixel discretization is doomed to fail as a defense technique. This paper revisits pixel discretization approaches. We hypothesize that existing approaches have failed because they use a single fixed codebook for the entire dataset. In particular, we find that a fixed codebook can make images more susceptible to adversarial perturbations and can also cause a significant loss of accuracy after discretization. We propose a novel image preprocessing technique called Essential Features that uses an adaptive codebook based on per-image content and the threat model. Essential Features adaptively selects a separable set of color clusters for each image, reducing the color space while preserving the pertinent features of the original image and maximizing both the separability and the representation of colors. Additionally, to limit the adversary's ability to influence the chosen color clusters, Essential Features takes advantage of spatial correlation with an adaptive blur that moves pixels closer to their original values without destroying the original edge information. We design several adaptive attacks and find that our approach is more robust than previous baselines against $L_\infty$ and $L_2$ bounded attacks on several challenging datasets, including CIFAR-10, GTSRB, RESISC45, and ImageNet.
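The contrast between a fixed codebook and a per-image codebook can be illustrated with a minimal sketch. This is not the Essential Features algorithm: plain k-means stands in for the adaptive cluster selection, and the adaptive blur and the threat-model-dependent choices are omitted. The helper names (`quantize`, `per_image_codebook`, `make_demo_image`), the 8-color RGB-corner codebook, the synthetic two-tone image, and the shift of 0.06 are all assumptions made for illustration.

```python
import itertools
import numpy as np

# Hypothetical fixed codebook: the 8 corners of the RGB cube, shared by all images.
FIXED_CODEBOOK = np.array(list(itertools.product([0.0, 1.0], repeat=3)))

def quantize(image, codebook):
    """Map every pixel to its nearest codebook color (Euclidean distance in RGB)."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    # Squared distances between pixels and colors, shape (num_pixels, num_colors).
    dists = ((pixels[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return codebook[dists.argmin(axis=1)].reshape(image.shape)

def per_image_codebook(image, k=2, iters=10, seed=0):
    """Plain k-means over one image's pixels (a stand-in, not the paper's method)."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, 3).astype(np.float64)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(iters):
        dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def make_demo_image(seed=0):
    """Synthetic 8x8 image: two gray tones on either side of the 0.5 threshold."""
    rng = np.random.default_rng(seed)
    image = np.empty((8, 8, 3))
    image[:, :4] = 0.45
    image[:, 4:] = 0.55
    return image + rng.uniform(-0.005, 0.005, size=image.shape)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

image = make_demo_image()
fixed_out = quantize(image, FIXED_CODEBOOK)
adaptive_out = quantize(image, per_image_codebook(image, k=2))
print("fixed codebook MSE:    ", mse(image, fixed_out))
print("per-image codebook MSE:", mse(image, adaptive_out))

# A small uniform shift pushes the darker tone across the fixed decision boundary.
perturbed = np.clip(image + 0.06, 0.0, 1.0)
flip = np.abs(quantize(perturbed, FIXED_CODEBOOK) - fixed_out).max()
print("largest change in fixed-codebook output:", flip)
```

In this toy setting the fixed codebook both distorts the image heavily and lets a small shift flip pixels to a different color, while the per-image codebook places its colors where the image content actually lies. A real evaluation would also need attacks that target the codebook selection itself, which this sketch does not attempt.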