The vulnerability of deep neural networks to small and even imperceptible perturbations has become a central topic in deep learning research. Although several sophisticated defense mechanisms have been introduced, most were later shown to be ineffective. Yet a reliable evaluation of model robustness is mandatory for deployment in safety-critical scenarios. To overcome this problem, we propose a simple yet effective modification to the gradient calculation of state-of-the-art first-order adversarial attacks. Normally, the gradient update of an attack is calculated directly at the given data point. This approach is sensitive to noise and small local optima of the loss function. Inspired by gradient sampling techniques from non-convex optimization, we propose Dynamically Sampled Nonlocal Gradient Descent (DSNGD). DSNGD calculates the gradient direction of the adversarial attack as the weighted average over past gradients of the optimization history. Moreover, the distribution hyperparameters that define the sampling operation are learned automatically during the optimization scheme. We empirically show that by incorporating this nonlocal gradient information, we obtain a more accurate estimate of the global descent direction on noisy and non-convex loss surfaces. In addition, we show that DSNGD-based attacks are on average 35% faster while achieving 0.9% to 27.1% higher success rates compared to their gradient descent-based counterparts.
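To make the core idea concrete, the following is a minimal, illustrative sketch of a PGD-style L-infinity attack whose update direction is a weighted average over the gradients of past iterates. The uniform weighting, fixed history length, and all function and parameter names (`dsngd_like_attack`, `eps`, `alpha`, `steps`, `history`) are simplifying assumptions for illustration, not the paper's exact formulation, which additionally learns the sampling distribution's hyperparameters during optimization.

```python
import torch
import torch.nn.functional as F

def dsngd_like_attack(model, x, y, eps=8/255, alpha=2/255, steps=40, history=5):
    """Illustrative PGD-style L-inf attack: the step direction is an average
    over gradients collected along the optimization history (a nonlocal
    direction), rather than the gradient at the current point alone.
    Uniform weights and a fixed window are assumptions, not the paper's
    learned sampling scheme."""
    x_adv = x.clone().detach()
    past_grads = []  # gradients of previous iterates (the "optimization history")
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        past_grads.append(grad.detach())
        past_grads = past_grads[-history:]          # keep a sliding window of past gradients
        direction = torch.stack(past_grads).mean(dim=0)  # nonlocal, averaged descent direction
        x_adv = x_adv.detach() + alpha * direction.sign()
        # project back into the eps-ball around the clean input and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```

In this sketch, averaging over several past iterates smooths out noise and shallow local optima of the loss surface, which is the intuition the abstract describes for the nonlocal gradient direction.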