Existing work shows that neural networks trained by naive gradient-based optimization are prone to adversarial attacks: adding a small malicious perturbation to an ordinary input is enough to make the network misclassify it. At the same time, attacking a neural network is key to improving its robustness, since training on adversarial examples can make the network resist certain kinds of adversarial attacks. An adversarial attack can also reveal characteristics of the neural network itself, a complex high-dimensional non-linear function, as discussed in previous work. In this project, we develop a first-order method for attacking neural networks. Compared with other first-order attacks, our method achieves a much higher success rate; furthermore, it is much faster than second-order attacks and multi-step first-order attacks.
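To make the notion of a one-step first-order attack concrete, the sketch below applies a sign-gradient perturbation (in the style of the well-known fast gradient sign method) to a toy logistic-regression model in NumPy. The model, its weights, and the perturbation budget `eps` are illustrative assumptions for exposition only; this is not the method developed in this project.

```python
import numpy as np

def sign_gradient_attack(x, w, b, y, eps):
    """One-step first-order attack on a logistic model sigmoid(w.x + b).

    Moves x by eps in the sign of the loss gradient w.r.t. the input,
    which tends to increase the cross-entropy loss for true label y.
    """
    logit = w @ x + b
    p = 1.0 / (1.0 + np.exp(-logit))        # model's probability of class 1
    grad_x = (p - y) * w                    # d(cross-entropy)/dx for this model
    return x + eps * np.sign(grad_x)        # single sign-gradient step

def prob(x, w, b):
    """Model confidence that x belongs to class 1."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

# Toy example (illustrative values, not taken from the paper)
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([0.5, 0.5])    # clean input with true label y = 1
y = 1.0

x_adv = sign_gradient_attack(x, w, b, y, eps=0.3)
print(prob(x, w, b) > prob(x_adv, w, b))   # the attack lowers p(y=1 | x)
```

Because the step uses only the gradient sign, it needs a single backward pass, which is why such one-step first-order attacks are much cheaper than multi-step or second-order alternatives.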