Gradient descent during neural network training can be subject to many instabilities. The spectral density of the Jacobian is a key quantity for analyzing this stability. Following the work of Pennington et al., such Jacobians are modeled using free multiplicative convolutions from Free Probability Theory (FPT). We present a reliable and very fast method for computing the associated spectral densities for a given architecture and initialization. The method comes with controlled, provable convergence. Our technique is based on a homotopy method: an adaptive Newton-Raphson scheme that chains basins of attraction. To demonstrate the relevance of our method, we show that the relevant FPT metrics computed before training are highly correlated with final test accuracies, with correlations of up to 85\%. We also nuance the idea that learning happens at the edge of chaos by giving evidence that a highly desirable feature for neural networks is the hyperbolicity of their Jacobian at initialization.
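To make the homotopy idea concrete, here is a minimal, hedged sketch (not the paper's actual equations) of a Newton-Raphson scheme continued along a shrinking imaginary part. It solves the quadratic fixed-point equation for the Stieltjes transform of the Marchenko-Pastur law, used here only as a stand-in for the free-multiplicative-convolution equations governing network Jacobians; the function names, the aspect ratio `y`, and the `eps_path` schedule are illustrative choices.

```python
import numpy as np

def newton_stieltjes(z, y, m0, tol=1e-12, max_iter=100):
    """Newton-Raphson for F(m) = y*z*m^2 + (z + y - 1)*m + 1 = 0,
    the Marchenko-Pastur fixed-point equation for the Stieltjes transform."""
    m = m0
    for _ in range(max_iter):
        f = y * z * m**2 + (z + y - 1.0) * m + 1.0
        fp = 2.0 * y * z * m + (z + y - 1.0)
        step = f / fp
        m = m - step
        if abs(step) < tol:
            break
    return m

def spectral_density(xs, y=0.5, eps_path=(1.0, 0.3, 0.1, 0.03, 0.01, 1e-3, 1e-5)):
    """Density rho(x) = Im m(x + i*eps)/pi, with eps driven towards 0 by a
    homotopy: each Newton solve is warm-started from the previous one,
    chaining basins of attraction as eps shrinks."""
    rho = np.empty_like(xs)
    for i, x in enumerate(xs):
        m = -1.0 / (x + 1j * eps_path[0])   # trivial large-|z| branch as a start
        for eps in eps_path:                # continuation in eps, warm-started Newton
            m = newton_stieltjes(x + 1j * eps, y, m)
        rho[i] = m.imag / np.pi
    return rho

if __name__ == "__main__":
    xs = np.linspace(1e-3, 3.5, 400)
    rho = spectral_density(xs, y=0.5)
    # Sanity check against the closed-form Marchenko-Pastur density on its support.
    a, b = (1 - np.sqrt(0.5))**2, (1 + np.sqrt(0.5))**2
    exact = np.where((xs > a) & (xs < b),
                     np.sqrt(np.maximum((b - xs) * (xs - a), 0.0)) / (2 * np.pi * 0.5 * xs),
                     0.0)
    print("max abs deviation from closed form:", np.max(np.abs(rho - exact)))
```

The same pattern, large regularization first and then a warm-started descent towards the real axis, is what "chaining basins of attraction" refers to: each intermediate solution lies inside the Newton basin of the next, slightly harder, problem.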