Deep neural networks used for image classification often rely on convolutional filters to extract distinguishing features before passing them to a linear classifier. Most interpretability literature focuses on assigning semantic meaning to convolutional filters in order to explain a model's reasoning process and confirm its use of relevant information from the input domain. Fully connected layers can be studied by decomposing their weight matrices with a singular value decomposition, in effect examining the correlations between the rows of each matrix to discover the dynamics of the map. In this work, we define a singular value decomposition for the weight tensor of a convolutional layer, which provides an analogous understanding of the correlations between filters and exposes the dynamics of the convolutional map. We validate our definition using recent results in random matrix theory. By applying the decomposition across the linear layers of an image classification network, we suggest a framework for interpretability methods in which hypergraphs model class separation. Rather than looking to the activations to explain the network, we use the singular vectors with the largest singular values of each linear layer to identify the features most important to the network. We illustrate our approach with examples and introduce the DeepDataProfiler library, the analysis tool used for this study.
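To make the construction concrete, the sketch below shows one common way to obtain a singular value decomposition from a convolutional layer: unfold the four-dimensional weight tensor into a matrix whose rows are the flattened filters and apply an ordinary SVD. This is an illustrative approximation only, not necessarily the exact definition proposed in this work or the one implemented in DeepDataProfiler; the layer shapes and the choice of unfolding are assumptions made for the example.

```python
# Minimal sketch, assuming a standard PyTorch Conv2d layer with illustrative shapes.
import torch

conv = torch.nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)

# Unfold (out_channels, in_channels, kH, kW) -> (out_channels, in_channels*kH*kW),
# treating each filter as one row of the matrix.
W = conv.weight.detach()
W_mat = W.reshape(W.shape[0], -1)

# Ordinary SVD of the unfolded weight matrix.
U, S, Vh = torch.linalg.svd(W_mat, full_matrices=False)

# The singular vectors paired with the largest singular values indicate which
# combinations of filters dominate the layer's linear map.
k = 5
top_directions = U[:, :k]        # shape: (out_channels, k)
top_singular_values = S[:k]
print(top_singular_values)
```

In this sketch the correlations between filters are captured by how the rows of `W_mat` combine in the leading singular vectors, which mirrors the way the abstract describes studying row correlations in a fully connected layer's weight matrix.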