Formal verification has emerged as a promising method to ensure the safety and reliability of neural networks. However, many relevant properties, such as fairness or global robustness, pertain to the entire input space. Applied naively, verification techniques check the neural network even on inputs that never occur in the real world and carry no meaning. To address this shortcoming, we propose the VeriFlow architecture, a flow-based density model tailored to allow any verification approach to restrict its search to a data distribution of interest. We argue that our architecture is particularly well suited for this purpose because of two major properties. First, we show that the transformation defined by our model is piecewise affine. Therefore, the model permits the use of verifiers based on constraint solving with linear arithmetic. Second, upper density level sets (UDLs) of the data distribution are definable via linear constraints in the latent space. As a consequence, representations of UDLs specified by a given probability are effectively computable in the latent space. This property allows for effective verification with fine-grained, probabilistically interpretable control over how atypical the inputs subject to verification are.
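To make the latent-space UDL construction concrete, the following minimal sketch assumes a standard Laplace base distribution, under which upper density level sets are L1 balls and therefore expressible with linear constraints; the function names (`udl_radius`, `udl_linear_constraints`) and the choice of base distribution are illustrative assumptions, not necessarily the exact construction used in the architecture.

```python
import numpy as np
from scipy.stats import gamma


def udl_radius(alpha: float, d: int) -> float:
    """Radius r such that the L1 ball {z : ||z||_1 <= r} in a d-dimensional
    standard-Laplace latent space contains probability mass alpha.

    For z with i.i.d. Laplace(0, 1) components, |z_i| ~ Exponential(1),
    so ||z||_1 ~ Gamma(d, 1); r is the alpha-quantile of that distribution.
    Because the Laplace density decreases monotonically in ||z||_1, this
    ball is exactly the UDL covering mass alpha.
    """
    return gamma.ppf(alpha, a=d)


def udl_linear_constraints(r: float, d: int):
    """Return (A, b) describing {(z, t) : A @ [z; t] <= b}, whose projection
    onto z is the L1 ball of radius r. This is the standard linear-arithmetic
    encoding a hypothetical LRA-based verifier could consume: auxiliary
    variables t_i bound |z_i| from above, and their sum is bounded by r.
    """
    A = np.zeros((2 * d + 1, 2 * d))
    b = np.zeros(2 * d + 1)
    for i in range(d):
        A[2 * i, i], A[2 * i, d + i] = 1.0, -1.0        #  z_i - t_i <= 0
        A[2 * i + 1, i], A[2 * i + 1, d + i] = -1.0, -1.0  # -z_i - t_i <= 0
    A[-1, d:] = 1.0                                      #  sum_i t_i <= r
    b[-1] = r
    return A, b


# Example: latent region covering 99% of the probability mass in 8 dimensions.
A, b = udl_linear_constraints(udl_radius(0.99, d=8), d=8)
```

Composed with a piecewise-affine flow, such a constraint set restricts verification queries to inputs whose latent representation lies inside the chosen UDL, with alpha controlling how atypical those inputs may be.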