In this work, we leverage visual prompting (VP) to improve the adversarial robustness of a fixed, pre-trained model at test time. Compared to conventional adversarial defenses, VP allows us to design universal (i.e., data-agnostic) input prompting templates, which can be applied in a plug-and-play manner at test time to achieve the desired model performance without much computational overhead. Although VP has been successfully applied to improving model generalization, it remains unclear whether and how it can be used to defend against adversarial attacks. We investigate this problem and show that the vanilla VP approach is not effective for adversarial defense, since a universal input prompt lacks the capacity for robust learning against sample-specific adversarial perturbations. To address this limitation, we propose a new VP method, termed Class-wise Adversarial Visual Prompting (C-AVP), which generates class-wise visual prompts so as to not only leverage the strengths of ensemble prompts but also optimize their interrelations to improve model robustness. Our experiments show that C-AVP outperforms the conventional VP method, with a 2.1× gain in standard accuracy and a 2× gain in robust accuracy. Compared to classical test-time defenses, C-AVP also yields a 42× inference-time speedup.
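To make the contrast between a universal prompt and class-wise prompts concrete, the following is a minimal sketch of the two inference rules. It rests on assumptions that the abstract does not state: prompts are modeled as additive perturbations clipped to the pixel range [0, 1], and the class-wise rule scores class i using only its own prompt and returns the highest-scoring class. The names `apply_prompt`, `predict_universal`, `predict_classwise`, and `toy_model` are hypothetical, and the sketch does not cover how the prompts are trained.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# A frozen model maps an input vector to one logit per class.
Model = Callable[[Sequence[float]], Vector]


def apply_prompt(x: Sequence[float], delta: Sequence[float]) -> Vector:
    """Add a prompt to the input and clip to [0, 1] (assumed additive form)."""
    return [min(1.0, max(0.0, xi + di)) for xi, di in zip(x, delta)]


def predict_universal(model: Model, x: Sequence[float],
                      delta: Sequence[float]) -> int:
    """Vanilla VP: the same data-agnostic prompt is applied to every input."""
    logits = model(apply_prompt(x, delta))
    return max(range(len(logits)), key=logits.__getitem__)


def predict_classwise(model: Model, x: Sequence[float],
                      deltas: Sequence[Sequence[float]]) -> int:
    """Class-wise VP (assumed rule): score class i with its own prompt
    deltas[i], then return the class with the highest such score."""
    scores = [model(apply_prompt(x, d))[i] for i, d in enumerate(deltas)]
    return max(range(len(scores)), key=scores.__getitem__)


def toy_model(x: Sequence[float]) -> Vector:
    """Hypothetical 2-class linear model standing in for a pre-trained network."""
    return [x[0] - x[1], x[1] - x[0]]
```

Under this assumed rule, the class-wise variant needs one forward pass per class, whereas the universal variant needs a single pass; the model weights stay untouched in both cases.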