Although great progress has been made on adversarial attacks against deep neural networks (DNNs), their transferability is still unsatisfactory, especially for targeted attacks. Two long-overlooked problems lie behind this: 1) the conventional setting of $T$ iterations with a step size of $\epsilon/T$ to comply with the $\epsilon$-constraint, under which most pixels are allowed to add only very small noise, much less than $\epsilon$; and 2) the usual practice of manipulating noise pixel-wise, even though the features a DNN extracts for a pixel are influenced by its surrounding regions, and different DNNs generally focus on different discriminative regions in recognition. To tackle these issues, our previous work proposed a patch-wise iterative method (PIM) for crafting adversarial examples with high transferability. Specifically, we introduce an amplification factor to the step size in each iteration, and the portion of a pixel's overall gradient that overflows the $\epsilon$-constraint is properly assigned to its surrounding regions by a project kernel. However, targeted attacks aim to push adversarial examples into the territory of a specific class, and the amplification factor alone may lead to underfitting. We therefore introduce a temperature and propose a patch-wise++ iterative method (PIM++) that further improves transferability without significantly sacrificing the performance of the white-box attack. Our method can be generally integrated into any gradient-based attack method. Compared with current state-of-the-art attack methods, we significantly improve the success rate by 33.1\% for defense models and 31.4\% for normally trained models on average.
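To make the mechanism concrete, below is a minimal, hypothetical PyTorch sketch of one way the described update could be realized: a targeted iterative attack with an amplified step size ($\beta \cdot \epsilon/T$), a uniform "project kernel" that redistributes the noise overflowing the $\epsilon$-ball to neighboring pixels, and temperature-scaled logits. The function name, default hyperparameter values, and the choice of a uniform depthwise kernel are assumptions for illustration, not the authors' exact implementation.

\begin{verbatim}
import torch
import torch.nn.functional as F

def pim_pp_targeted(model, x, y_target, eps=16/255, T=10,
                    beta=10.0, gamma=8.0, temperature=1.5, ksize=3):
    # Hypothetical sketch of a patch-wise++ style targeted attack.
    alpha = eps / T  # conventional step size under the eps-constraint
    # Uniform project kernel W_p, applied per channel (depthwise conv).
    w_p = torch.ones(3, 1, ksize, ksize) / (ksize * ksize)
    adv = x.clone().detach()
    for _ in range(T):
        adv.requires_grad_(True)
        logits = model(adv) / temperature          # temperature scaling
        loss = F.cross_entropy(logits, y_target)   # targeted objective
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            # Amplified step: beta * alpha instead of plain alpha.
            adv = adv - beta * alpha * grad.sign()
            # "Cut noise": the part of the noise overflowing the eps-ball.
            noise = adv - x
            cut = torch.clamp(noise.abs() - eps, min=0) * noise.sign()
            # Reassign the overflow to surrounding pixels via W_p.
            proj = F.conv2d(cut, w_p, padding=ksize // 2, groups=3)
            adv = adv + gamma * alpha * proj.sign()
            # Project back into the eps-ball and the valid image range.
            adv = x + torch.clamp(adv - x, -eps, eps)
            adv = torch.clamp(adv, 0, 1).detach()
    return adv
\end{verbatim}

The design intuition, per the abstract, is that the amplified step lets each pixel's noise escape the overly tight $\epsilon/T$ budget, while the project kernel converts the clipped-off overflow into patch-wise perturbations that better match the region-level features DNNs actually attend to; the temperature is then added so that targeted attacks do not underfit the target class.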