Transferable adversarial attacks craft adversarial examples on a pretrained surrogate model with a known label space in order to fool unknown black-box models. These attacks are therefore limited by the availability of an effective surrogate model. In this work, we relax this assumption and propose Adversarial Pixel Restoration as a self-supervised alternative for training an effective surrogate model from scratch with no labels and only a few data samples. Our training approach is based on a min-max scheme that reduces overfitting via an adversarial objective and thus optimizes for a more generalizable surrogate model. Our proposed attack is complementary to adversarial pixel restoration and is independent of any task-specific objective, as it can be launched in a self-supervised manner. We demonstrate the adversarial transferability of our approach to Vision Transformers as well as Convolutional Neural Networks on the tasks of classification, object detection, and video segmentation. Our training approach improves the transferability of the baseline unsupervised training method by 16.4% on the ImageNet validation set. Our code and pre-trained surrogate models are available at: https://github.com/HashmatShadab/APR
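The min-max scheme described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy linear restorer `W` in place of a deep autoencoder, an L-infinity budget `eps`, signed-gradient ascent for the inner step, and hand-derived gradients in NumPy. The names `restoration_loss`, `inner_max`, and `train_surrogate`, and all hyperparameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def restoration_loss(W, x_in, x_clean):
    """Mean squared error between the restored input and the clean target."""
    residual = x_in @ W.T - x_clean
    return float(np.mean(residual ** 2))


def inner_max(W, x, eps, steps=3):
    """Inner maximization (assumed form): signed-gradient ascent on the
    restoration loss, with the perturbation kept in an L-inf ball of radius eps."""
    step_size = eps / steps
    delta = np.zeros_like(x)
    for _ in range(steps):
        residual = (x + delta) @ W.T - x
        # Gradient of the mean squared residual with respect to delta.
        grad_delta = 2.0 * residual @ W / residual.size
        delta = np.clip(delta + step_size * np.sign(grad_delta), -eps, eps)
    return x + delta


def train_surrogate(x, eps=0.1, lr=0.5, epochs=200, seed=0):
    """Outer minimization: fit W to restore clean samples from perturbed ones."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    W = rng.normal(scale=0.1, size=(d, d))  # toy linear restorer (assumption)
    for _ in range(epochs):
        x_adv = inner_max(W, x, eps)         # max step: worst-case input
        residual = x_adv @ W.T - x           # restore towards the clean sample
        # Gradient of the mean squared residual with respect to W.
        grad_W = 2.0 * residual.T @ x_adv / residual.size
        W -= lr * grad_W                     # min step: update the restorer
    return W


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    # "Few data samples": 8 unlabeled vectors standing in for flattened images.
    x = data_rng.uniform(0.0, 1.0, size=(8, 16))
    W = train_surrogate(x)
    print("clean restoration loss:", restoration_loss(W, x, x))
    print("adversarial restoration loss:", restoration_loss(W, inner_max(W, x, 0.1), x))
```

No labels appear anywhere in the loop: the clean sample itself is the training target, which is what makes the objective self-supervised in this sketch.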