Stochastic Neural Networks (SNNs) that inject noise into their hidden layers have recently been shown to achieve strong robustness against adversarial attacks. However, existing SNNs are usually heuristically motivated, and often rely on adversarial training, which is computationally costly. We propose a new SNN that achieves state-of-the-art performance without relying on adversarial training, and enjoys solid theoretical justification. Specifically, while existing SNNs inject learned or hand-tuned isotropic noise, our SNN learns an anisotropic noise distribution to optimize a learning-theoretic bound on adversarial robustness. We evaluate our method on a number of popular benchmarks, show that it can be applied to different architectures, and that it provides robustness to a variety of white-box and black-box attacks, while being simple and fast to train compared to existing alternatives.
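As a rough illustration only (not the paper's actual implementation, and all names here are hypothetical), the difference between isotropic noise injection and the learned anisotropic noise described above can be sketched as a hidden layer whose per-unit noise scales are trainable parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

class AnisotropicNoiseLayer:
    """Hidden layer that adds zero-mean Gaussian noise with a learned
    anisotropic (diagonal) covariance.

    With isotropic noise, every hidden unit shares a single variance.
    Here each unit i carries its own learnable log-std log_sigma[i],
    so the noise covariance is diag(sigma_1^2, ..., sigma_d^2) and the
    model can allocate more or less noise per dimension during training.
    """

    def __init__(self, dim):
        # Learnable per-dimension log standard deviations
        # (initialized so that sigma_i = 1 for all units).
        self.log_sigma = np.zeros(dim)

    def forward(self, h, train=True):
        if not train:
            # A deterministic pass for illustration; a real SNN defense
            # may instead keep sampling noise at inference time.
            return h
        sigma = np.exp(self.log_sigma)          # per-unit noise scales
        eps = rng.standard_normal(h.shape)      # standard Gaussian draw
        return h + sigma * eps                  # anisotropic perturbation
```

In this sketch, `log_sigma` would be updated by gradient descent alongside the network weights; the paper's contribution is choosing that update to optimize a learning-theoretic robustness bound rather than tuning the noise by hand.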