By adding human-imperceptible noise to clean images, the resulting adversarial examples can fool other unknown models. Features of a pixel extracted by deep neural networks (DNNs) are influenced by its surrounding regions, and different DNNs generally focus on different discriminative regions during recognition. Motivated by this, we propose a patch-wise iterative algorithm -- a black-box attack against mainstream normally trained and defense models -- which differs from existing attack methods that manipulate pixel-wise noise. In this way, our adversarial examples achieve strong transferability without sacrificing white-box attack performance. Specifically, we apply an amplification factor to the step size in each iteration, and the portion of one pixel's overall gradient that overflows the $\epsilon$-constraint is properly assigned to its surrounding regions by a project kernel. Our method can be generally integrated into any gradient-based attack method. Compared with the current state-of-the-art attacks, we significantly improve the success rate by 9.2\% for defense models and 3.7\% for normally trained models on average. Our code is available at \url{https://github.com/qilong-zhang/Patch-wise-iterative-attack}
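A minimal sketch of the update rule described above, written in PyTorch, is given below. It is an interpretation based only on the abstract, not the authors' released implementation: the amplified step size, the uniform project kernel, the kernel size, and the choice of projection step size (here set equal to the amplified step) are all assumptions for illustration.

\begin{verbatim}
import torch
import torch.nn.functional as F

def patchwise_attack(model, x, y, eps=16/255, num_iter=10,
                     amp=10.0, kern_size=3):
    """Sketch of a patch-wise iterative attack (assumed update rule).
    The basic step size eps/num_iter is scaled by an amplification
    factor `amp`; the part of the accumulated noise that exceeds the
    eps-constraint is redistributed to neighbouring pixels through a
    uniform "project kernel" before the final clipping."""
    alpha = eps / num_iter          # basic step size
    beta = amp * alpha              # amplified step size
    gamma = beta                    # projection step (assumed = beta)
    # uniform project kernel applied channel-wise (an assumption)
    w_p = torch.ones(3, 1, kern_size, kern_size,
                     device=x.device) / (kern_size ** 2)

    x_adv = x.clone().detach()
    a = torch.zeros_like(x)         # accumulated amplified noise
    for _ in range(num_iter):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        # amplified gradient-sign step
        a = a + beta * grad.sign()
        # overflow: the portion of |a| exceeding the eps-constraint
        c = torch.clamp(a.abs() - eps, min=0) * a.sign()
        # assign the overflow to surrounding pixels via the kernel
        a = a + gamma * torch.sign(
            F.conv2d(c, w_p, padding=kern_size // 2, groups=3))

        # project back into the eps-ball and valid image range
        x_adv = (x + a.clamp(-eps, eps)).clamp(0, 1).detach()
    return x_adv
\end{verbatim}

Because the perturbation is built from sign steps and a local convolution, the same loop can wrap any gradient-based attack by replacing how `grad` is computed (e.g., with momentum-accumulated or input-diversity gradients).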