DNNs can be easily fooled by adding human-imperceptible perturbations to images. As one of the mainstream methods, feature space targeted attacks perturb images by modulating their intermediate feature maps so that the discrepancy between the intermediate source and target features is minimized. However, the current choice of pixel-wise Euclidean distance to measure this discrepancy is questionable, because it unreasonably imposes a spatial-consistency constraint on the source and target features. Intuitively, an image can be categorized as "cat" no matter whether the cat is on the left or the right of the image. To address this issue, we propose to measure the discrepancy using statistic alignment. Specifically, we design two novel approaches, called Pair-wise Alignment Attack and Global-wise Alignment Attack, which measure similarities between feature maps by high-order statistics with translation invariance. Furthermore, we systematically analyze the layer-wise transferability with varied difficulties to obtain highly reliable attacks. Extensive experiments verify the effectiveness of our proposed method: it outperforms the state-of-the-art algorithms by a large margin. Our code is publicly available at https://github.com/yaya-cheng/PAA-GAA.
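To make the statistic-alignment idea concrete, below is a minimal PyTorch sketch of translation-invariant feature losses in the spirit of the pair-wise and global-wise alignment described above. The function names, the Gram-matrix formulation for the pair-wise case, the mean/standard-deviation statistics for the global case, and the normalization constants are illustrative assumptions, not the paper's reference implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    # feat: (B, C, H, W) intermediate feature map.
    # The Gram matrix captures channel-wise co-activation statistics and
    # discards spatial layout, so the measure is translation invariant.
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def pairwise_alignment_loss(src_feat, tgt_feat):
    # Hypothetical PAA-style loss: align second-order (pair-wise) channel
    # statistics instead of taking a pixel-wise Euclidean distance.
    return F.mse_loss(gram_matrix(src_feat), gram_matrix(tgt_feat))

def global_alignment_loss(src_feat, tgt_feat):
    # Hypothetical GAA-style loss: align global per-channel mean and
    # standard deviation, an even coarser translation-invariant statistic.
    src_mu, src_std = src_feat.mean(dim=(2, 3)), src_feat.std(dim=(2, 3))
    tgt_mu, tgt_std = tgt_feat.mean(dim=(2, 3)), tgt_feat.std(dim=(2, 3))
    return F.mse_loss(src_mu, tgt_mu) + F.mse_loss(src_std, tgt_std)
```

In a targeted attack loop, one of these losses would replace the pixel-wise Euclidean distance between the source image's and target image's feature maps at a chosen intermediate layer, and the perturbation would be updated by gradient descent on that loss.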