Neural networks are very successful at detecting patterns in noisy data, and have become the technology of choice in many fields. However, their usefulness is hampered by their susceptibility to adversarial attacks. Recently, many methods for measuring and improving a network's robustness to adversarial perturbations have been proposed, and this growing body of research has given rise to numerous explicit or implicit notions of robustness. Connections between these notions are often subtle, and a systematic comparison between them is missing in the literature. In this paper, we begin to address this gap by setting up general principles for the empirical analysis and evaluation of a network's robustness as a mathematical property: during the network's training phase, during its verification, and after its deployment. We then apply these principles and conduct a case study that showcases the practical benefits of our general approach.