Understanding the black-box representations in Deep Neural Networks (DNNs) is an essential problem in deep learning. In this work, we propose Graph-Based Similarity (GBS) to measure the similarity of layer features. Unlike previous work, which computes similarity directly on the feature maps, GBS measures the correlation between graphs constructed from hidden-layer outputs. By treating each input sample as a node and the similarity between the corresponding layer outputs as an edge, we construct a graph of DNN representations for each layer. The similarity between layer graphs identifies correspondences between the representations of models trained on different datasets and from different initializations. We demonstrate and prove the invariance properties of GBS, namely invariance to orthogonal transformation and invariance to isotropic scaling, and compare GBS with CKA. GBS shows state-of-the-art performance in reflecting similarity and provides insights into the behavior of adversarial samples in the hidden-layer space.
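The graph construction described above can be illustrated with a minimal sketch. The abstract does not give the exact GBS formula, so the choices here are assumptions made only for illustration: edge weights are cosine similarities between per-sample layer outputs, and two layer graphs are compared by the Pearson correlation of their off-diagonal edge weights. The function names `layer_graph` and `graph_similarity` are hypothetical and are not taken from the paper.

```python
import numpy as np


def layer_graph(features):
    """Build a sample graph for one layer (illustrative assumption, not the paper's definition).

    Nodes are input samples; the edge weight between two samples is the
    cosine similarity of their flattened layer outputs.
    """
    x = np.asarray(features, dtype=np.float64)
    x = x.reshape(x.shape[0], -1)  # one row per input sample
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return x @ x.T  # (n_samples, n_samples) adjacency matrix


def graph_similarity(feats_a, feats_b):
    """Compare two layer graphs built on the same input samples.

    Assumed stand-in for the graph comparison step: Pearson correlation
    of the edge weights above the diagonal.
    """
    adj_a = layer_graph(feats_a)
    adj_b = layer_graph(feats_b)
    iu = np.triu_indices(adj_a.shape[0], k=1)  # skip self-loops
    return float(np.corrcoef(adj_a[iu], adj_b[iu])[0, 1])


rng = np.random.default_rng(0)
feats = rng.normal(size=(32, 16))  # 32 samples, 16 features

# Random orthogonal matrix from a QR decomposition, plus isotropic scaling.
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
rotated_scaled = 3.0 * feats @ q

unrelated = rng.normal(size=(32, 16))  # independent random features

sim_same = graph_similarity(feats, rotated_scaled)
sim_unrelated = graph_similarity(feats, unrelated)
print(sim_same, sim_unrelated)
```

Under these assumptions, the score for `rotated_scaled` stays at 1 up to floating-point error, because cosine similarity is unchanged by orthogonal transformation and isotropic scaling of the features, which mirrors the two invariances the abstract claims for GBS. The score for `unrelated` is expected to be much lower. Since each graph is indexed by samples rather than by features, the two layers being compared may have different widths.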