We find that different Deep Neural Networks (DNNs) trained on the same dataset share a common principal subspace in their latent spaces, regardless of the architecture used (e.g., Convolutional Neural Networks (CNNs), Multi-Layer Perceptrons (MLPs), and Autoencoders (AEs)) and of whether labels were used in training (e.g., supervised, unsupervised, and self-supervised learning). Specifically, we design a new metric, the $\mathcal{P}$-vector, to represent the principal subspace of the deep features learned by a DNN, and we propose to measure the angles between principal subspaces using $\mathcal{P}$-vectors. We find small angles (with cosine close to $1.0$) in comparisons between any two DNNs trained with different algorithms or architectures. Furthermore, during training from random initialization, the angle decreases from a large value (usually $70^\circ$ to $80^\circ$) to a small one, which coincides with the progress of feature-space learning from scratch to convergence. We then carry out case studies that measure the angle between the $\mathcal{P}$-vector and the principal subspace of the training dataset, and we connect this angle with generalization performance. Extensive experiments with commonly used MLPs, AEs, and CNNs for classification, image reconstruction, and self-supervised learning tasks on the MNIST, CIFAR-10, and CIFAR-100 datasets support our claims.

Keywords: Interpretability of Deep Learning, Feature Learning, and Subspaces of Deep Features
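The angle measurement described above can be illustrated with a minimal sketch. The abstract does not give the exact definition of the $\mathcal{P}$-vector, so the code below assumes it is the top left singular vector of an (n_samples, n_features) feature matrix, which allows networks with different feature widths to be compared over the same samples. The helper names `p_vector` and `angle_degrees` and the synthetic feature matrices are hypothetical and stand in for real network activations.

```python
import numpy as np


def p_vector(features):
    """Top left singular vector of an (n_samples, n_features) matrix.

    Assumption: this stands in for the P-vector; the abstract itself
    does not state the exact definition.
    """
    u, _, _ = np.linalg.svd(features, full_matrices=False)
    return u[:, 0]


def angle_degrees(p, q):
    """Angle between two directions, ignoring the sign of each vector."""
    cos = abs(np.dot(p, q)) / (np.linalg.norm(p) * np.linalg.norm(q))
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))


# Synthetic stand-ins for deep features of two models on the same samples.
rng = np.random.default_rng(0)
n = 200
shared = rng.normal(size=(n, 1))  # common per-sample factor (hypothetical)

# Two "models" with different feature widths that both encode the shared factor.
feats_a = shared @ rng.normal(size=(1, 64)) * 5 + 0.1 * rng.normal(size=(n, 64))
feats_b = shared @ rng.normal(size=(1, 32)) * 5 + 0.1 * rng.normal(size=(n, 32))

# A "model" whose features are unrelated to the shared factor.
feats_rand = rng.normal(size=(n, 32))

angle_shared = angle_degrees(p_vector(feats_a), p_vector(feats_b))
angle_random = angle_degrees(p_vector(feats_a), p_vector(feats_rand))

print(f"angle between related feature sets:   {angle_shared:.2f} degrees")
print(f"angle between unrelated feature sets: {angle_random:.2f} degrees")
```

In this toy setting the two related feature sets give a small angle, while the unrelated one gives an angle close to orthogonal. The absolute value of the cosine is taken because the sign of a singular vector is arbitrary.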