Owen and Hoyt recently showed that the effective dimension offers key structural information about the input-output mapping underlying an artificial neural network. Along this line of research, this work proposes an estimation procedure that allows the calculation of the mean dimension from a given dataset, without resampling from external distributions. The design yields total indices when features are independent and a variant of total indices when features are correlated. We show that this variant possesses the zero-independence property. With synthetic datasets, we analyse how the mean dimension evolves layer by layer and how the activation function impacts the magnitude of interactions. We then use the mean dimension to study some of the most widely employed convolutional architectures for image recognition (LeNet, ResNet, DenseNet). To account for pixel correlations, we propose calculating the mean dimension after the addition of an inverse PCA layer, which allows one to work on uncorrelated PCA-transformed features without retraining the neural network. We use the generalized total indices to produce heatmaps for post-hoc explanations, and we employ the mean dimension on the PCA-transformed features for cross-comparisons of the artificial neural network structures. Results provide several insights into the differences in magnitude of interactions across the architectures, as well as indications of how the mean dimension evolves during training.
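To make the two ingredients of the abstract concrete, the sketch below illustrates (i) the classical Jansen pick-and-freeze estimator of total Sobol' indices, whose sum equals the mean dimension when features are independent, and (ii) an inverse-PCA wrapper that evaluates a trained network on images reconstructed from uncorrelated PCA scores without retraining. This is a minimal illustration, not the given-data estimator proposed in the paper: the Jansen estimator requires two independent input samples, and the names `total_indices_jansen`, `mean_dimension`, and `wrap_with_inverse_pca` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA


def total_indices_jansen(model, X, X_prime):
    """Jansen pick-and-freeze estimator of total Sobol' indices.

    model:    callable mapping an (n, d) array to an (n,) array of outputs
    X, X_prime: two independent (n, d) samples of the features,
                assumed mutually independent across columns
    Returns the vector of (normalized) total indices.
    """
    n, d = X.shape
    y = model(X)
    var_y = y.var()
    T = np.empty(d)
    for i in range(d):
        X_mix = X.copy()
        X_mix[:, i] = X_prime[:, i]          # resample only feature i
        y_mix = model(X_mix)
        T[i] = 0.5 * np.mean((y - y_mix) ** 2) / var_y
    return T


def mean_dimension(model, X, X_prime):
    """Mean dimension = sum of total Sobol' indices (independent features)."""
    return total_indices_jansen(model, X, X_prime).sum()


def wrap_with_inverse_pca(network, pca):
    """Hypothetical inverse-PCA wrapper: the trained network is evaluated on
    images reconstructed from PCA scores, so sensitivity indices can be
    computed on the uncorrelated scores without retraining the network."""
    def g(Z):
        return network(pca.inverse_transform(Z))
    return g
```

As a usage sketch, one could fit `pca = PCA().fit(images)`, obtain scores `Z = pca.transform(images)`, and estimate the mean dimension of `wrap_with_inverse_pca(network, pca)` from two disjoint halves of `Z`. PCA scores are only uncorrelated, not necessarily independent, which is precisely the situation the paper's variant of total indices and its given-data estimator are designed to handle.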