Although leveraging the transferability of adversarial examples can attain a fairly high attack success rate for non-targeted attacks, it does not work well for targeted attacks, since the gradient directions from a source image to a targeted class usually differ across DNNs. To increase the transferability of targeted attacks, recent studies align the features of generated adversarial examples with the feature distribution of the targeted class learned from an auxiliary network or a generative adversarial network. However, these works assume that the training dataset is available and require considerable time to train the auxiliary networks, which makes them hard to apply in real-world scenarios. In this paper, we revisit adversarial examples with targeted transferability from the perspective of universality and find that highly universal adversarial perturbations tend to be more transferable. Based on this observation, we propose the Locality of Images (LI) attack to improve targeted transferability. Specifically, instead of using the classification loss alone, LI introduces a feature similarity loss between the intermediate features of the adversarially perturbed original image and those of randomly cropped images, which makes the features induced by the adversarial perturbation more dominant than those of the benign image and hence improves targeted transferability. By incorporating the locality of images into the optimization of perturbations, the LI attack emphasizes that targeted perturbations should remain effective across diverse input patterns, even local image patches. Extensive experiments demonstrate that LI achieves high success rates for transfer-based targeted attacks. On the ImageNet-compatible dataset, LI yields an improvement of 12\% over existing state-of-the-art methods.
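The abstract describes the LI objective only at a high level: a targeted classification loss combined with a feature-similarity term between the adversarially perturbed image and a random crop of it. The PyTorch sketch below illustrates one plausible instantiation under that description; the choice of intermediate layer, the cosine-similarity measure, the crop size, and the weighting factor `lam` are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

def li_loss(model, feat_layer, x_adv, target, lam=1.0, crop_size=200):
    """One evaluation of a hypothetical LI objective: targeted
    cross-entropy plus a feature-similarity term between the full
    adversarial image and a random crop of it (resized back)."""
    # Forward hook to capture the intermediate feature map of the
    # assumed layer handle (e.g. model.layer3 for a ResNet surrogate).
    feats = {}
    handle = feat_layer.register_forward_hook(
        lambda mod, inp, out: feats.update(f=out))

    # Classification loss on the full adversarial image (targeted CE).
    logits = model(x_adv)
    f_full = feats["f"]
    cls_loss = F.cross_entropy(logits, target)

    # Random crop of the adversarial image, resized to the input size,
    # so the perturbation must remain effective on local patches.
    crop = T.Compose([T.RandomCrop(crop_size), T.Resize(x_adv.shape[-1])])
    model(crop(x_adv))
    f_crop = feats["f"]

    # Feature-similarity loss: pull the crop's intermediate features
    # toward those of the full image via cosine similarity.
    sim = F.cosine_similarity(f_full.flatten(1), f_crop.flatten(1)).mean()
    feat_loss = 1.0 - sim

    handle.remove()
    return cls_loss + lam * feat_loss
```

In an attack loop, this loss would be minimized with respect to the perturbation (e.g. by iterative signed-gradient updates under an L-infinity budget); those optimization details are likewise not specified in the abstract.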