Enhancing the Transferability of Adversarial Attacks through Variance Tuning

Deep neural networks are vulnerable to adversarial examples that mislead the models with imperceptible perturbations. Though adversarial attacks have achieved incredible success rates in the white-box setting, most existing adversaries exhibit weak transferability in the black-box setting, especially when attacking models with defense mechanisms. In this work, we propose a new method called variance tuning to enhance the class of iterative gradient-based attack methods and improve their attack transferability. Specifically, at each iteration of the gradient calculation, instead of directly using the current gradient for the momentum accumulation, we further consider the gradient variance of the previous iteration to tune the current gradient, so as to stabilize the update direction and escape from poor local optima. Empirical results on the standard ImageNet dataset demonstrate that our method could significantly improve the transferability of gradient-based adversarial attacks. Besides, our method could be used to attack ensemble models or be integrated with various input transformations. Incorporating variance tuning with input transformations on iterative gradient-based attacks in the multi-model setting, the integrated method could achieve an average success rate of 90.1% against nine advanced defense methods, improving the current best attack performance significantly by 85.1%. Code is available at https://github.com/JHL-HUST/VT.
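
The update described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (that is in the linked repository), and several details are assumptions that the abstract does not state: the "gradient variance" is taken to be the average gradient over `n_samples` points drawn uniformly from a box of half-width `beta * eps` around the current point, minus the gradient at that point; the tuned gradient is L1-normalized before momentum accumulation; and each step moves by the sign of the momentum and is clipped to an L-infinity ball of radius `eps`. The names `variance_tuned_attack`, `grad_fn`, `n_samples`, `beta`, and `mu`, and the toy linear loss, are hypothetical.

```python
import random


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def variance_tuned_attack(x, grad_fn, eps=0.3, steps=10, mu=1.0,
                          n_samples=20, beta=1.5, seed=0):
    """Toy momentum-iterative attack with variance tuning on a list of floats.

    grad_fn(point) must return the loss gradient at point as a list.
    All hyperparameter names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    alpha = eps / steps                 # per-iteration step size
    x_adv = list(x)
    momentum = [0.0] * len(x)           # accumulated gradient
    variance = [0.0] * len(x)           # gradient variance of the previous iteration

    for _ in range(steps):
        grad = grad_fn(x_adv)

        # Tune the current gradient with the previous iteration's variance,
        # normalize by the L1 norm, then accumulate into the momentum.
        tuned = [g + v for g, v in zip(grad, variance)]
        l1_norm = sum(abs(t) for t in tuned) or 1.0
        momentum = [mu * m + t / l1_norm for m, t in zip(momentum, tuned)]

        # Variance for the next iteration: mean gradient over random
        # neighbors of x_adv minus the gradient at x_adv itself.
        mean_grad = [0.0] * len(x)
        for _ in range(n_samples):
            neighbor = [xi + rng.uniform(-beta * eps, beta * eps) for xi in x_adv]
            for j, gj in enumerate(grad_fn(neighbor)):
                mean_grad[j] += gj / n_samples
        variance = [mg - g for mg, g in zip(mean_grad, grad)]

        # Signed ascent step, projected back into the eps-ball around x.
        x_adv = [min(max(xa + alpha * sign(m), x0 - eps), x0 + eps)
                 for xa, m, x0 in zip(x_adv, momentum, x)]
    return x_adv


# Toy usage: a linear loss J(x) = w . x has the constant gradient w.
w = [1.0, -2.0, 0.5]
x_adv = variance_tuned_attack([0.0, 0.0, 0.0], lambda point: list(w))
```

With a linear loss the neighborhood gradients equal the current gradient, so the variance term is essentially zero and the sketch reduces to a plain momentum sign attack. The variance term only changes the update direction when the gradient actually varies around the current point, which is the situation the abstract describes as stabilizing the update.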
