Previous work has shown that a neural network with the rectified linear unit (ReLU) activation function induces a convex polyhedral decomposition of the input space. Such a decomposition can be represented by a dual graph, a subgraph of a Hamming graph, whose vertices correspond to polyhedra and whose edges join polyhedra that share a facet. This paper illustrates how the dual graph can be used to detect and analyze adversarial attacks on digital images. When an image passes through a network containing ReLU nodes, the firing or non-firing of each node can be encoded as a bit ($1$ for ReLU activation, $0$ for ReLU non-activation). The sequence of all such bits associates the image with a bit vector, which identifies a polyhedron in the decomposition and, in turn, a vertex in the dual graph. We identify ReLU bits that discriminate between non-adversarial and adversarial images and examine how well collections of these discriminators can vote as an ensemble to form an adversarial image detector. Specifically, we examine the similarities and differences between the ReLU bit vectors of adversarial images and those of their non-adversarial counterparts, using a pre-trained ResNet-50 architecture. While this paper focuses on adversarial digital images, the ResNet-50 architecture, and the ReLU activation function, our methods extend to other network architectures, activation functions, and types of datasets.
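The bit-vector encoding described above can be illustrated on a toy network. The following is a minimal sketch, not the authors' implementation: the two-layer weights, the inputs `clean` and `perturbed`, the helper names, and the hand-picked discriminator indices are all hypothetical stand-ins for a pre-trained ResNet-50 and for discriminators selected from data. It records one bit per ReLU node, compares two inputs by the Hamming distance between their bit vectors, and applies a simple majority vote over a few chosen bits.

```python
import numpy as np

def relu_bit_vector(x, layers):
    """Concatenate the ReLU activation pattern of every layer (1 = active, 0 = inactive)."""
    bits = []
    h = np.asarray(x, dtype=float)
    for W, b in layers:
        z = W @ h + b            # pre-activation of this layer
        bits.append(z > 0)       # a ReLU node fires when its pre-activation is positive
        h = np.maximum(z, 0.0)   # ReLU output feeds the next layer
    return np.concatenate(bits).astype(np.uint8)

def hamming_distance(u, v):
    """Number of ReLU nodes whose on/off state differs between two inputs."""
    return int(np.count_nonzero(u != v))

def majority_vote(bits, discriminator_idx, adversarial_value):
    """Flag an input when more than half of the chosen bits match the adversarial pattern."""
    votes = bits[discriminator_idx] == adversarial_value
    return bool(2 * votes.sum() > votes.size)

# Hypothetical toy network: 2 inputs -> 3 ReLU nodes -> 2 ReLU nodes.
layers = [
    (np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]), np.array([0.0, 0.0, -1.0])),
    (np.array([[1.0, -1.0, 0.0], [0.0, 1.0, 1.0]]), np.array([0.0, -0.5])),
]

clean = np.array([0.2, 0.3])       # stand-in for a non-adversarial image
perturbed = np.array([0.8, 0.3])   # stand-in for its adversarial counterpart

bits_clean = relu_bit_vector(clean, layers)
bits_perturbed = relu_bit_vector(perturbed, layers)
distance = hamming_distance(bits_clean, bits_perturbed)

# Assumption: bits 2 and 3 were picked by hand as "discriminators" for this toy case.
idx = np.array([2, 3])
pattern = np.array([1, 1], dtype=np.uint8)
flag_clean = majority_vote(bits_clean, idx, pattern)
flag_perturbed = majority_vote(bits_perturbed, idx, pattern)

print(bits_clean, bits_perturbed, distance, flag_clean, flag_perturbed)
```

In this sketch, two inputs with the same bit vector lie in the same polyhedron, and bit vectors that differ in exactly one position are adjacent in the Hamming graph. For a real network the bits would have to be collected from the ReLU layers themselves, for example with framework-specific hooks, which is beyond this toy example.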