Numerous researchers have recently applied empirical spectral analysis to the study of modern deep learning classifiers. We identify and discuss an important formal class/cross-class structure and show how it lies at the origin of many visually striking features observed in deepnet spectra, some of which were reported in recent articles, while others are unveiled here for the first time. These include spectral outliers ("spikes") and small but distinct continuous distributions ("bumps") often seen beyond the edge of a "main bulk". The significance of the cross-class structure is illustrated in three ways: (i) we prove that, in the context of multinomial logistic regression, the ratio of outliers to bulk in the spectrum of the Fisher information matrix is predictive of misclassification; (ii) we demonstrate how, gradually with depth, a network is able to separate class-distinctive information from class variability, all while orthogonalizing the class-distinctive information; and (iii) we propose a correction to KFAC, a well-known second-order optimization algorithm for training deepnets.
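Claim (i) concerns outlier eigenvalues ("spikes") beyond the bulk in the spectrum of the Fisher information matrix for multinomial logistic regression. As a minimal illustration only (not the paper's construction), the sketch below forms an empirical Fisher matrix from per-sample gradients of the softmax cross-entropy loss on synthetic Gaussian class clusters and inspects its eigenvalues; the data model, sizes, and the parameter point `W` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, n = 3, 20, 2000  # classes, features, samples (illustrative sizes)

# Synthetic data: each class is a Gaussian cluster around its own mean,
# so the data carries the kind of class structure the abstract discusses.
means = 3.0 * rng.standard_normal((K, d))
y = rng.integers(0, K, size=n)
X = means[y] + rng.standard_normal((n, d))

W = 0.1 * rng.standard_normal((K, d))  # arbitrary parameter point (assumption)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

P = softmax(X @ W.T)                 # (n, K) predicted class probabilities
onehot = np.eye(K)[y]                # (n, K) one-hot labels
# Per-sample gradient of the log-loss w.r.t. W is (p - y) outer x.
G = ((P - onehot)[:, :, None] * X[:, None, :]).reshape(n, K * d)
F = G.T @ G / n                      # empirical Fisher matrix, (Kd, Kd)

eigs = np.linalg.eigvalsh(F)[::-1]   # eigenvalues in descending order
print("top eigenvalues:", eigs[:5])
print("median (bulk scale):", np.median(eigs))
```

With well-separated class means, a handful of top eigenvalues typically stand well clear of the bulk; how that outlier-to-bulk ratio relates to misclassification is the subject of the paper's claim (i).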