Transferable adversarial attacks optimize adversarial examples on a pretrained surrogate model with a known label space in order to fool unknown black-box models. These attacks are therefore limited by the availability of an effective surrogate model. In this work, we relax this assumption and propose Adversarial Pixel Restoration as a self-supervised alternative for training an effective surrogate model from scratch, with no labels and only a few data samples. Our training approach is based on a min-max objective that reduces overfitting via an adversarial objective and thus optimizes for a more generalizable surrogate model. Our proposed attack is complementary to adversarial pixel restoration and is independent of any task-specific objective, as it can be launched in a self-supervised manner. We demonstrate the adversarial transferability of our approach to Vision Transformers as well as Convolutional Neural Networks on the tasks of classification, object detection, and video segmentation. Our code and pre-trained surrogate models are available at: https://github.com/HashmatShadab/APR
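The min-max training objective mentioned above can be read, in generic form, as an adversarial-training problem applied to a pixel-level restoration loss. The sketch below is an illustration under stated assumptions, not the exact formulation from the paper: the symbols $f_\theta$ (surrogate model), $T$ (a hypothetical input corruption that the model must undo), $\mathcal{L}$ (a pixel-wise reconstruction loss), and the $\ell_\infty$ budget $\epsilon$ are all introduced here for exposition.

```latex
% Generic min-max sketch (all symbols are illustrative assumptions):
%   f_\theta    : surrogate model trained from scratch, no labels
%   T           : hypothetical corruption applied to the clean image x
%   \delta      : adversarial perturbation bounded by \epsilon (assumed \ell_\infty)
%   \mathcal{L} : pixel-level restoration loss against the clean image x
\min_{\theta} \; \mathbb{E}_{x \sim \mathcal{D}}
  \Big[ \max_{\|\delta\|_{\infty} \le \epsilon}
        \mathcal{L}\big( f_\theta( T(x) + \delta ),\; x \big) \Big]
```

Under this reading, the inner maximization finds the perturbation that makes restoration hardest, and the outer minimization trains the surrogate to restore the clean pixels anyway, which is the mechanism the abstract credits with reducing overfitting when only a few samples are available.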