We study the distribution of singular values of products of random matrices pertinent to the analysis of deep neural networks. The matrices resemble products of sample covariance matrices; an important difference, however, is that the population covariance matrices, which in statistics and random matrix theory are assumed to be non-random, or random but independent of the random data matrix, are now certain functions of the random data matrices (synaptic weight matrices in deep neural network terminology). The problem has been treated in recent work [25, 13] using the techniques of free probability theory. Since, however, free probability theory deals with population covariance matrices that are independent of the data matrices, its applicability has to be justified. This justification was given in [22] for Gaussian data matrices with independent entries, a standard analytical model of free probability, by using a version of the techniques of random matrix theory. In this paper we use another, more streamlined, version of the techniques of random matrix theory to generalize the results of [22] to the case where the entries of the synaptic weight matrices are just independent identically distributed random variables with zero mean and finite fourth moment. This, in particular, extends the property of so-called macroscopic universality to the random matrices considered.