Existing neural networks for computer vision tasks are vulnerable to adversarial attacks: adding imperceptible perturbations to the input images can fool these methods into making a false prediction on an image that was correctly predicted without the perturbation. Various defense methods have proposed image-to-image mappings, either including these perturbations in the training process or removing them in a preprocessing denoising step. In doing so, existing methods often ignore that the natural RGB images in today's datasets are not captured directly but, in fact, recovered from RAW color filter array captures that are subject to various degradations during capture. In this work, we exploit this RAW data distribution as an empirical prior for adversarial defense. Specifically, we propose a model-agnostic adversarial defense method that maps the input RGB image to Bayer RAW space and back to an output RGB image using a learned camera image signal processing (ISP) pipeline, eliminating potential adversarial patterns along the way. The proposed method acts as an off-the-shelf preprocessing module and, unlike model-specific adversarial training methods, does not require adversarial images for training. As a result, the method generalizes to unseen tasks without additional retraining. Experiments on large-scale datasets (e.g., ImageNet, COCO) for different vision tasks (e.g., classification, semantic segmentation, object detection) validate that the method significantly outperforms existing methods across task domains.
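To make the described pipeline concrete, below is a minimal PyTorch sketch of the idea: the input RGB image is mosaicked into a Bayer RAW plane and mapped back to RGB through a learned ISP-style network before a frozen, unmodified task model sees it. The names `rgb_to_bayer`, `TinyISP`, and `RawDefense`, as well as the small convolutional ISP stand-in, are illustrative assumptions and not the paper's actual implementation.

```python
# Minimal sketch (assumptions: PyTorch available; modules below are illustrative
# stand-ins for the paper's learned RGB->RAW mapping and learned ISP).
import torch
import torch.nn as nn


def rgb_to_bayer(rgb: torch.Tensor) -> torch.Tensor:
    """Mosaic an RGB batch (B, 3, H, W) into a single-channel RGGB Bayer plane."""
    b, _, h, w = rgb.shape
    bayer = torch.zeros(b, 1, h, w, device=rgb.device, dtype=rgb.dtype)
    bayer[:, 0, 0::2, 0::2] = rgb[:, 0, 0::2, 0::2]  # R at even rows/cols
    bayer[:, 0, 0::2, 1::2] = rgb[:, 1, 0::2, 1::2]  # G
    bayer[:, 0, 1::2, 0::2] = rgb[:, 1, 1::2, 0::2]  # G
    bayer[:, 0, 1::2, 1::2] = rgb[:, 2, 1::2, 1::2]  # B at odd rows/cols
    return bayer


class TinyISP(nn.Module):
    """Hypothetical stand-in for the learned ISP mapping Bayer RAW back to RGB."""

    def __init__(self, width: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, 3, 3, padding=1),
        )

    def forward(self, bayer: torch.Tensor) -> torch.Tensor:
        return self.net(bayer)


class RawDefense(nn.Module):
    """Off-the-shelf preprocessing: RGB -> Bayer RAW -> RGB, then the task model."""

    def __init__(self, isp: nn.Module, task_model: nn.Module):
        super().__init__()
        self.isp = isp
        self.task_model = task_model

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        cleaned = self.isp(rgb_to_bayer(rgb))
        return self.task_model(cleaned)


# Usage (illustrative): wrap any pretrained model without retraining it, e.g.
#   defended = RawDefense(TinyISP(), torchvision.models.resnet50(weights="DEFAULT"))
#   logits = defended(possibly_adversarial_images)
```

Because the defense only touches the input, the wrapped task model stays frozen, which is what allows the same preprocessing module to be reused across classification, segmentation, and detection without task-specific retraining.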