As adversarial attacks against machine learning models have raised increasing concerns, many denoising-based defense approaches have been proposed. In this paper, we summarize and analyze defense strategies that take the form of a symmetric transformation via data denoising and reconstruction (denoted as $F$ + inverse $F$, i.e., the $F$-$IF$ framework). In particular, we categorize these denoising strategies along three axes: denoising in the spatial domain, the frequency domain, and the latent space. Typically, defense is applied to the entire adversarial example, so both the image and the perturbation are modified, which makes it difficult to tell how the defense counteracts the perturbation. To evaluate the robustness of these denoising strategies more directly, we apply them to the adversarial noise itself (assuming all of it has been obtained), which spares us from sacrificing benign accuracy. Surprisingly, our experimental results show that even when most of the perturbation in each dimension is eliminated, satisfactory robustness remains difficult to achieve. Based on these findings and analyses, we propose an adaptive compression strategy that applies different compression rates to different frequency bands in the feature domain to improve robustness. Our experimental results show that the adaptive compression strategy enables the model to better suppress adversarial perturbations and improves robustness compared with existing denoising strategies.
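The band-wise compression idea above can be sketched in a few lines. The following is a minimal illustration, not the paper's method: it splits the 2-D DCT spectrum of an input into three bands and scales each band by a separate factor, so higher-frequency content (where adversarial energy often concentrates) can be compressed more aggressively. The band edges and compression factors here are hypothetical placeholders.

```python
import numpy as np
from scipy.fft import dctn, idctn

def adaptive_band_compress(x, factors=(1.0, 0.5, 0.1)):
    """Scale the 2-D DCT coefficients of `x` band by band.

    `factors` gives one compression factor per band (low, mid, high).
    Band edges and factor values are illustrative, not the paper's.
    """
    c = dctn(x, norm="ortho")            # forward transform (the "F" step)
    h, w = c.shape
    # Normalized frequency index in [0, 1): 0 = DC, 1 = highest band.
    yy, xx = np.mgrid[0:h, 0:w]
    r = (yy / h + xx / w) / 2.0
    bands = np.digitize(r, [1 / 3, 2 / 3])   # 0 = low, 1 = mid, 2 = high
    scale = np.choose(bands, factors)        # per-coefficient factor
    return idctn(c * scale, norm="ortho")    # inverse transform ("F^{-1}")
```

With all factors set to 1.0 the transform pair is lossless, which makes the symmetric $F$ + inverse $F$ structure explicit; shrinking the high-band factor attenuates fine-grained perturbations while leaving low-frequency image content largely intact.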