Neural networks have been used successfully in a variety of fields, which has led to a great deal of interest in developing a theoretical understanding of how they store the information needed to perform a particular task. We study the weight matrices of trained deep neural networks using methods from random matrix theory (RMT) and show that the statistics of most of the singular values follow universal RMT predictions. This suggests that they are random and do not contain system-specific information, which we investigate further by comparing the statistics of eigenvector entries to the universal Porter-Thomas distribution. We find that for most eigenvectors the hypothesis of randomness cannot be rejected, and that only eigenvectors belonging to the largest singular values deviate from the RMT prediction, indicating that they may encode learned information. In addition, a comparison with RMT predictions also makes it possible to distinguish networks trained in different learning regimes, from lazy to rich learning. We analyze the spectral distribution of the large singular values using the Hill estimator and find that the distribution cannot in general be characterized by a tail index, i.e., it is not of power-law type.
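The three diagnostics mentioned in the abstract can each be computed with a few lines of standard numerics. The following is a minimal sketch, not the authors' code: it uses a random Gaussian matrix as a stand-in for a trained weight matrix (in the paper's setting one would load the actual layer weights), the nearest-neighbor spacing-ratio statistic as a common unfolding-free proxy for universal RMT level statistics (GOE mean ratio approximately 0.5359), the chi-squared-with-one-degree-of-freedom form of the Porter-Thomas distribution for scaled squared vector entries, and the textbook Hill estimator on the upper order statistics. All matrix sizes and the cutoff `k_tail` are illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Stand-in for a trained weight matrix; replace with the actual layer weights.
W = rng.standard_normal((512, 784))

# --- Singular-value spacing ratios vs. the universal RMT prediction ---
# r_n = min(d_n, d_{n+1}) / max(d_n, d_{n+1}) needs no spectral unfolding;
# for GOE-class statistics the mean ratio is approximately 0.5359.
s = np.sort(np.linalg.svd(W, compute_uv=False))
gaps = np.diff(s)
r = np.minimum(gaps[:-1], gaps[1:]) / np.maximum(gaps[:-1], gaps[1:])
print(f"mean spacing ratio <r> = {r.mean():.4f}  (GOE prediction ~0.5359)")

# --- Porter-Thomas test on singular-vector entries ---
# For a fully random unit vector v in dimension N, the scaled squared
# entries N * v_i^2 follow a chi-squared distribution with 1 degree of
# freedom; a KS test checks whether randomness can be rejected.
_, _, Vt = np.linalg.svd(W)
N = Vt.shape[1]
for k in (0, N // 2):  # vector of the largest singular value vs. a bulk vector
    x = N * Vt[k] ** 2
    pval = stats.kstest(x, stats.chi2(df=1).cdf).pvalue
    print(f"singular vector {k}: KS p-value vs Porter-Thomas = {pval:.3f}")

# --- Hill estimator for the tail of the singular-value distribution ---
# gamma_hat = (1/k) * sum_i log(X_(n-i+1) / X_(n-k)) over the k largest values;
# a stable, nonzero gamma_hat across k would indicate a power-law tail.
k_tail = 50
tail = s[-k_tail - 1:]          # X_(n-k), then the k largest values (ascending)
hill = np.mean(np.log(tail[1:] / tail[0]))
print(f"Hill estimator (k={k_tail}): gamma_hat = {hill:.3f}")
```

For a purely random matrix the spacing ratios should match the GOE value and the KS test should not reject Porter-Thomas for any vector; the abstract's claim is that trained weights behave this way in the bulk, with deviations confined to the vectors of the largest singular values.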