Artificial Neural Networks (ANNs) are being deployed in an increasing number of safety-critical applications, including autonomous cars and medical diagnosis. However, concerns about their reliability have been raised due to their black-box nature and apparent fragility to adversarial attacks. Here, we develop and evaluate a symbolic verification framework using incremental model checking (IMC) and satisfiability modulo theories (SMT) to check for vulnerabilities in ANNs. More specifically, we propose several ANN-related optimizations for IMC, including invariant inference via interval analysis and the discretization of non-linear activation functions. With these, we can provide guarantees on the safe behavior of ANNs implemented in both floating-point and fixed-point (quantized) arithmetic. Our verification approach was able to verify, or produce adversarial examples for, 52 test cases spanning image classification and general machine learning applications. For small- to medium-sized ANNs, our approach completes most of its verification runs in minutes. Moreover, in contrast to most state-of-the-art methods, our approach is not restricted to specific choices of activation functions or to non-quantized representations.
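As an illustrative sketch of the invariant-inference idea mentioned above (not the paper's actual implementation), interval analysis can be viewed as propagating lower and upper bounds on each neuron through the network; the bounds obtained per layer can then serve as invariants in the symbolic encoding. The network weights, input box, and helper names below are hypothetical.

```python
import numpy as np

def affine_interval(W, b, lo, hi):
    """Propagate the input box [lo, hi] through y = W x + b.
    Positive weights draw from the same bound, negative weights from the opposite one."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    y_lo = W_pos @ lo + W_neg @ hi + b
    y_hi = W_pos @ hi + W_neg @ lo + b
    return y_lo, y_hi

def relu_interval(lo, hi):
    """ReLU is monotone, so clamping both bounds at zero remains sound."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Hypothetical 2-2-1 network and an input box around a nominal point.
W1, b1 = np.array([[1.0, -2.0], [0.5, 1.5]]), np.array([0.1, -0.2])
W2, b2 = np.array([[1.0, -1.0]]), np.array([0.0])
lo, hi = np.array([0.4, 0.4]), np.array([0.6, 0.6])

lo, hi = relu_interval(*affine_interval(W1, b1, lo, hi))
lo, hi = affine_interval(W2, b2, lo, hi)
print(lo, hi)  # output bounds usable as candidate invariants in the SMT encoding
```

Such bounds can tighten the verification conditions handed to the IMC/SMT back end, and an analogous propagation over fixed-point ranges would apply in the quantized setting.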