In this work we propose Energy Attack, a transfer-based black-box $L_\infty$-adversarial attack. The attack is parameter-free and requires no gradient approximation. Specifically, we first obtain white-box adversarial perturbations of a surrogate model and divide these perturbations into small patches. We then extract the unit component vectors and eigenvalues of these patches with principal component analysis (PCA). Based on the eigenvalues, we model the energy distribution of adversarial perturbations. We then perform black-box attacks by sampling from the perturbation patches according to their energy distribution and tiling the sampled patches to form a full-size adversarial perturbation. This can be done without any access to the victim models. Extensive experiments demonstrate that the proposed Energy Attack achieves state-of-the-art performance in black-box attacks on various models and several datasets. Moreover, the extracted distribution transfers across different model architectures and different datasets, and is therefore intrinsic to vision architectures.
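The pipeline described above (patch extraction, PCA, energy-proportional sampling, tiling) can be sketched as follows. This is a minimal illustration with placeholder data, not the paper's implementation: the patch size, image size, $\epsilon$ budget, and the sign-quantization of sampled components are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical setup (placeholders, not the paper's data) ---
patch_size, image_size, eps = 8, 32, 8 / 255

# Stand-in for white-box surrogate perturbations cut into 8x8 patches,
# each flattened into one row of a matrix.
patches = rng.standard_normal((1000, patch_size ** 2))

# PCA: eigendecomposition of the patch covariance yields unit component
# vectors (columns of eigvecs) and eigenvalues (their "energies").
cov = np.cov(patches, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative values

# Energy distribution: each component is sampled with probability
# proportional to its eigenvalue.
probs = eigvals / eigvals.sum()

# Sample one component per tile position, sign-quantize it to respect
# the L_inf budget, and tile into a full-size perturbation.
n_tiles = image_size // patch_size
rows = []
for _ in range(n_tiles):
    row = []
    for _ in range(n_tiles):
        k = rng.choice(len(probs), p=probs)
        row.append(np.sign(eigvecs[:, k]).reshape(patch_size, patch_size))
    rows.append(np.hstack(row))
perturbation = eps * np.vstack(rows)

print(perturbation.shape)                    # (32, 32)
print(np.abs(perturbation).max() <= eps)     # True
```

Note that none of this touches the victim model: the distribution is built entirely from the surrogate, and the sampled perturbations are then evaluated against the black-box target.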