The paper deals with the distribution of singular values of the input-output Jacobian of deep untrained neural networks in the limit of infinite width. The Jacobian is a product of random matrices in which independent rectangular weight matrices alternate with diagonal matrices whose entries depend on the corresponding column of the adjacent weight matrix. The problem was considered in \cite{Pe-Co:18} for Gaussian weights and biases and also for weights that are Haar distributed orthogonal matrices and Gaussian biases. Based on a free probability argument, it was claimed that in these cases the singular value distribution of the Jacobian in the limit of infinite width (matrix size) coincides with that of an analog of the Jacobian with special random but weight-independent diagonal matrices, a case well known in random matrix theory. The claim was rigorously proved in \cite{Pa-Sl:21} for a quite general class of weights and biases with i.i.d. (including Gaussian) entries by using a version of the techniques of random matrix theory. In this paper we use another version of those techniques to justify the claim for Haar distributed random weight matrices and Gaussian biases.
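For orientation, the following is a minimal sketch of the setup in generic notation; the activation $\varphi$, depth $L$, weights $W_l$, and biases $b_l$ below are illustrative assumptions, not necessarily the paper's own notation:
\begin{align*}
x^{l} &= \varphi\!\left(W_l x^{l-1} + b_l\right), \qquad l = 1, \dots, L,\\
J_L &= \frac{\partial x^{L}}{\partial x^{0}} = D_L W_L \cdots D_1 W_1, \qquad
(D_l)_{ii} = \varphi'\!\left(\big(W_l x^{l-1} + b_l\big)_i\right),
\end{align*}
so each diagonal matrix $D_l$ depends on a row (or, in the transposed convention, a column) of the adjacent weight matrix $W_l$. The claim in question is that, as the widths tend to infinity, the singular value distribution of $J_L$ coincides with that of the same product with the $D_l$ replaced by suitable weight-independent random diagonal matrices.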