Deep learning models are vulnerable to adversarial examples. Many defenses based on randomized neural networks have been proposed to address this problem, but they fail to achieve robustness against attacks that use proxy gradients, such as the Expectation over Transformation (EOT) attack. We investigate the effect of proxy-gradient-based adversarial attacks on randomized neural networks and demonstrate that their effectiveness depends heavily on the directional distribution of the loss gradients of the randomized neural network. In particular, we show that proxy gradients are less effective when the gradients are more scattered. Motivated by this observation, we propose Gradient Diversity (GradDiv) regularizations that minimize the concentration of the gradients in order to build a robust randomized neural network. Our experiments on MNIST, CIFAR10, and STL10 show that the proposed GradDiv regularizations improve the adversarial robustness of randomized neural networks against a variety of state-of-the-art attack methods. Moreover, our method effectively reduces the transferability among sample models of a randomized neural network.
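As a rough illustration of the idea (not the paper's exact formulation), the sketch below penalizes the directional concentration of input-loss gradients across sampled models using the mean resultant length from directional statistics; the names `model_samples` and `lambda_div`, and the training-loop wiring, are assumptions for this example.

```python
import torch
import torch.nn.functional as F

def grad_concentration(model_samples, x, y):
    """Mean resultant length of input-gradient directions across sampled models.

    A value near 1 means the sampled models' gradients point in nearly the same
    direction (favorable for proxy-gradient attacks such as EOT); a value near 0
    means the directions are scattered.
    """
    unit_grads = []
    for model in model_samples:
        x_ = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_), y)
        (g,) = torch.autograd.grad(loss, x_, create_graph=True)
        g = g.flatten(start_dim=1)                # one gradient vector per example
        unit_grads.append(F.normalize(g, dim=1))  # keep the direction only
    G = torch.stack(unit_grads, dim=0)            # (n_models, batch, dim)
    resultant = G.mean(dim=0)                     # average direction per example
    return resultant.norm(dim=1).mean()           # mean resultant length in [0, 1]

# Hypothetical training step: add the concentration penalty to the task loss so
# that the gradients of sampled models disagree in direction.
# loss = F.cross_entropy(model(x), y) + lambda_div * grad_concentration(model_samples, x, y)
```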