We explore the convergence of deep neural networks with the popular ReLU activation function as the depth of the networks tends to infinity. To this end, we introduce the notions of activation domains and activation matrices of a ReLU network. By replacing each application of the ReLU activation function with multiplication by an activation matrix on an activation domain, we obtain an explicit expression for the ReLU network. We then identify the convergence of ReLU networks with the convergence of a class of infinite products of matrices, and study sufficient and necessary conditions for the convergence of such infinite products. As a result, we establish, as necessary conditions for ReLU networks to converge, that the sequence of weight matrices converges to the identity matrix and the sequence of bias vectors converges to the zero vector as the depth of the ReLU networks increases to infinity. Moreover, we obtain sufficient conditions, in terms of the weight matrices and bias vectors at the hidden layers, for the pointwise convergence of deep ReLU networks. These results provide mathematical insight into the design strategy of the well-known deep residual networks used in image classification.
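A minimal sketch of the activation-matrix idea may help fix intuition (the symbols $\sigma$, $D_k$, $W_k$, $b_k$ below are illustrative and not necessarily the paper's notation, and the final linear output layer is omitted). On an activation domain the sign pattern of every pre-activation is fixed, so the elementwise ReLU $\sigma$ acts as multiplication by a diagonal matrix with entries in $\{0,1\}$:
$$\sigma(W_k x + b_k) = D_k(W_k x + b_k), \qquad (D_k)_{ii} \in \{0,1\}.$$
Composing $n$ hidden layers on that domain then yields the affine expression
$$\mathcal{N}_n(x) = D_n W_n \cdots D_1 W_1\, x \;+\; \sum_{k=1}^{n} D_n W_n \cdots D_{k+1} W_{k+1}\, D_k b_k,$$
with the empty product at $k = n$ taken to be the identity, so the convergence of the networks as $n \to \infty$ reduces to the convergence of infinite products of matrices of the form $D_n W_n \cdots D_1 W_1$, together with the accumulated bias terms.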