Gradient descent during the training of a neural network can be subject to many instabilities. The spectral density of the Jacobian is a key quantity for analyzing robustness. Following the works of Pennington et al., such Jacobians are modeled using free multiplicative convolutions from Free Probability Theory. We present a reliable and very fast method for computing the associated spectral densities, with controlled and provable convergence. Our technique is based on an adaptive Newton-Raphson scheme that finds and chains basins of attraction: the Newton algorithm discovers contiguous lilypad-like basins and steps from one to the next, heading towards the objective. We demonstrate the applicability of our method by using it to assess how the learning process is affected by network depth, layer widths and initialization choices: empirically, final test losses are strongly correlated with our Free Probability metrics.
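To illustrate the "chained basins" idea described above, here is a minimal, hypothetical sketch of a Newton-Raphson continuation: each grid point is warm-started from the previously converged root, so every solve begins inside the basin of attraction of the next root. For concreteness it targets a toy case, the Marchenko-Pastur law, whose Stieltjes transform solves a simple quadratic, rather than the free multiplicative convolutions of Jacobian spectra handled by the paper's actual method; the function name and parameters are invented for the example.

```python
import numpy as np

# A minimal sketch (not the paper's implementation) of chaining Newton
# basins of attraction: a continuation along the real axis where each grid
# point reuses the previous converged root as its starting point.  The toy
# target is the Marchenko-Pastur law, whose Stieltjes transform m(z) solves
#     lam * z * m(z)^2 + (z + lam - 1) * m(z) + 1 = 0,
# with density rho(x) = Im m(x + i*eps) / pi.
def mp_density_newton(lam=0.5, eps=1e-6, n_grid=400):
    xs = np.linspace(1e-3, (1.0 + np.sqrt(lam)) ** 2 + 0.5, n_grid)
    m = 0.0 + 1e-3j                  # initial guess near the left spectral edge
    density = np.empty(n_grid)
    for i, x in enumerate(xs):
        z = x + 1j * eps             # evaluate just above the real axis
        for _ in range(100):         # plain Newton iterations on the quadratic
            f = lam * z * m**2 + (z + lam - 1.0) * m + 1.0
            df = 2.0 * lam * z * m + (z + lam - 1.0)
            step = f / df
            m = m - step
            if abs(step) < 1e-12:
                break
        density[i] = m.imag / np.pi  # recover the density from Im m
    return xs, density

xs, rho = mp_density_newton()
print(xs[rho.argmax()])              # location of the density's maximum on the grid
```

Because the quadratic has two roots, the warm start is what keeps Newton on the physical branch (the root with positive imaginary part); this is the same reason the chained-basin strategy matters in the general free-convolution setting.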