Adversarial perturbations are critical for certifying the robustness of deep learning models. A universal adversarial perturbation (UAP) can simultaneously attack multiple images, and thus offers a more unified threat model, obviating the need for an image-wise attack algorithm. However, existing UAP generators are underdeveloped when images are drawn from different image sources (e.g., with different image resolutions). Towards authentic universality across image sources, we take a novel view of UAP generation as a customized instance of few-shot learning, which leverages bilevel optimization and learning-to-optimize (L2O) techniques to generate UAPs with an improved attack success rate (ASR). We begin with the popular model-agnostic meta-learning (MAML) framework to meta-learn a UAP generator. However, we see that the MAML framework does not directly offer universal attacks across image sources, requiring us to integrate it with the other meta-learning framework, L2O. The resulting scheme for meta-learning a UAP generator (i) achieves a 50% higher ASR than baselines such as Projected Gradient Descent (PGD), (ii) runs 37% faster than the vanilla L2O and MAML frameworks (when applicable), and (iii) is able to simultaneously handle UAP generation for different victim models and image data sources.
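For concreteness, the sketch below illustrates the classical PGD-style UAP baseline referenced above: a single perturbation is shared across all images and updated by gradient ascent on the classification loss, with a projection onto an l_inf ball. The function name, hyperparameters, and PyTorch usage are illustrative assumptions, not the paper's implementation; note that such a baseline presumes one fixed image resolution, which is exactly the cross-source limitation the meta-learned generator is designed to remove.

```python
import torch
import torch.nn.functional as F


def pgd_universal_perturbation(model, loader, eps=8 / 255, alpha=2 / 255, steps=10, device="cpu"):
    """Hypothetical PGD-style UAP baseline: one shared perturbation for all images.

    Assumptions (not from the paper): images are in [0, 1], share a single
    resolution, and `loader` yields (images, labels) batches.
    """
    model.eval()
    delta = None
    for _ in range(steps):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            if delta is None:
                # One perturbation of shape [1, C, H, W], broadcast over the batch.
                delta = torch.zeros_like(images[:1], requires_grad=True)
            # Ascend the cross-entropy loss so the shared delta fools the model.
            loss = F.cross_entropy(model((images + delta).clamp(0, 1)), labels)
            loss.backward()
            with torch.no_grad():
                # Signed gradient step followed by projection onto the l_inf ball.
                delta += alpha * delta.grad.sign()
                delta.clamp_(-eps, eps)
            delta.grad.zero_()
    return delta.detach()
```

Because `delta` is tied to one tensor shape, this baseline must be rerun per resolution and per victim model, in contrast to the meta-learned generator described in the abstract.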