Free Probability Theory (FPT) provides a rich toolkit for handling the mathematical difficulties caused by the random matrices that arise in research on deep neural networks (DNNs), such as dynamical isometry, the Fisher information matrix, and training dynamics. FPT suits this line of research because a DNN's parameter-Jacobian and input-Jacobian are polynomials in the layerwise Jacobians. However, the critical assumption of asymptotic freeness of the layerwise Jacobians has not been fully proven so far. The asymptotic freeness assumption plays a fundamental role when propagating spectral distributions through the layers. Haar-distributed orthogonal matrices are essential for achieving dynamical isometry. In this work, we prove the asymptotic freeness of the layerwise Jacobians of a multilayer perceptron (MLP) in the case of Haar-distributed orthogonal weights. A key to the proof is an invariance of the MLP: considering orthogonal matrices that fix the hidden units in each layer, replacing each layer's parameter matrix with itself multiplied by such an orthogonal matrix leaves the MLP unchanged. Furthermore, if the original weights are Haar orthogonal, the Jacobian is also unchanged by this replacement. Using this key fact, we can finally replace each weight with a Haar orthogonal random matrix independent of the Jacobian of the activation function.
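To make the invariance concrete, the following is a minimal sketch in assumed notation (the symbols $x^{\ell}$, $W_{\ell}$, $\varphi$, and $O_{\ell}$ are illustrative, not fixed by the abstract): write the forward pass as $x^{\ell} = \varphi(W_{\ell}\, x^{\ell-1})$ and take any orthogonal $O_{\ell}$ fixing the incoming hidden units, $O_{\ell}\, x^{\ell-1} = x^{\ell-1}$. Then
\[
  (W_{\ell} O_{\ell})\, x^{\ell-1} \;=\; W_{\ell}\, x^{\ell-1},
  \qquad\text{so}\qquad
  \varphi\!\big((W_{\ell} O_{\ell})\, x^{\ell-1}\big) \;=\; \varphi\!\big(W_{\ell}\, x^{\ell-1}\big) \;=\; x^{\ell},
\]
and every hidden unit, hence the MLP itself, is unchanged. Heuristically, the right-invariance of the Haar measure then suggests that $W_{\ell} O_{\ell}$ is again Haar orthogonal whenever $W_{\ell}$ is.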