Deep neural networks are vulnerable to adversarial attacks. Most $L_{0}$-norm based white-box attacks craft perturbations using the gradient of the model with respect to the input. Because computing the Hessian matrix is costly in both time and memory, the use of Hessian or approximate Hessian information in white-box attacks has gradually been shelved. In this work, we note that the sparsity requirement on perturbations naturally lends itself to the use of Hessian information, since restricting the perturbation to a few pixels shrinks the set of optimization variables. We study the attack performance and computational cost of a Hessian-based attack with a limited number of perturbed pixels. Specifically, we propose the Limited Pixel BFGS (LP-BFGS) attack method, which combines a perturbation pixel selection strategy with the BFGS algorithm. The pixels with the top-k attribution scores, computed by the Integrated Gradients method, serve as the optimization variables of the LP-BFGS attack. Experimental results across different networks and datasets show that, for different numbers of perturbed pixels, our approach achieves attack performance comparable to existing solutions at a reasonable computational cost.
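The pipeline described above (score pixels with Integrated Gradients, keep the top-k, then run BFGS over only those k variables) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy linear softmax classifier so that gradients are analytic, a zero baseline for the attributions, and an illustrative objective (log-probability of the true class plus an $L_2$ penalty with a hypothetical weight `c`). It also ignores the valid pixel range. All function names here are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def log_softmax(z):
    # Numerically stable log-softmax.
    z = z - z.max()
    return z - np.log(np.exp(z).sum())


def prob_grad(W, b, x, y):
    # Gradient of the true-class probability p_y(x) w.r.t. x (toy linear model).
    p = np.exp(log_softmax(W @ x + b))
    return p[y] * (W[y] - p @ W)


def integrated_gradients(W, b, x, y, steps=50):
    # Midpoint Riemann-sum approximation with an all-zero baseline (assumption).
    alphas = (np.arange(steps) + 0.5) / steps
    grads = [prob_grad(W, b, a * x, y) for a in alphas]
    return x * np.mean(grads, axis=0)


def lp_bfgs_sketch(W, b, x, y, k, c=0.1):
    # 1) Select the k pixels with the highest attribution scores.
    idx = np.argsort(integrated_gradients(W, b, x, y))[-k:]

    # 2) Optimize only those k pixels; BFGS then maintains a k-by-k
    #    inverse-Hessian approximation instead of a full-image one.
    def objective(delta):
        x_adv = x.copy()
        x_adv[idx] += delta
        log_p = log_softmax(W @ x_adv + b)
        p = np.exp(log_p)
        value = log_p[y] + c * (delta @ delta)          # illustrative loss (assumption)
        grad = (W[y] - p @ W)[idx] + 2.0 * c * delta    # analytic gradient of that loss
        return value, grad

    result = minimize(objective, np.zeros(k), jac=True, method="BFGS")
    x_adv = x.copy()
    x_adv[idx] += result.x
    return x_adv, idx


# Toy data: a random 10-class linear model on a flattened 8x8 "image".
rng = np.random.default_rng(0)
W, b = rng.normal(size=(10, 64)), rng.normal(size=10)
x = rng.random(64)
y = int(np.argmax(W @ x + b))  # attack the currently predicted class
k = 5

x_adv, idx = lp_bfgs_sketch(W, b, x, y, k)
p_clean = np.exp(log_softmax(W @ x + b))[y]
p_adv = np.exp(log_softmax(W @ x_adv + b))[y]
print("perturbed pixels:", np.count_nonzero(x_adv - x))
print("true-class probability before/after:", p_clean, p_adv)
```

Because the optimization runs over only `k` variables, the curvature information BFGS accumulates has size k-by-k regardless of the image resolution, which is the property that makes second-order information affordable under a sparsity constraint. A real attack would substitute a deep network with automatic differentiation and would need to keep perturbed pixels within the valid input range.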