We consider dynamical and geometrical aspects of deep learning. For many standard choices of layer maps we display semi-invariant metrics which quantify differences between data or decision functions. This allows us, when considering random layer maps and using non-commutative ergodic theorems, to deduce that certain limits exist when letting the number of layers tend to infinity. We also examine the random initialization of standard networks where we observe a surprising cut-off phenomenon in terms of the number of layers, the depth of the network. This could be a relevant parameter when choosing an appropriate number of layers for a given learning task, or for selecting a good initialization procedure. More generally, we hope that the notions and results in this paper can provide a framework, in particular a geometric one, for a part of the theoretical understanding of deep neural networks.
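To illustrate the kind of depth-dependent behaviour alluded to above, here is a minimal numerical sketch (not the authors' experiment): we compose i.i.d. Gaussian fully connected layers with tanh activations and track how the normalized distance between the representations of two distinct inputs evolves with the number of layers. All concrete choices (width, depth, weight scaling sigma_w, activation) are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

width = 256      # hidden width (assumed)
depth = 100      # number of random layers composed (assumed)
sigma_w = 1.5    # weight std scaling (assumed); controls how fast representations evolve

# Two distinct inputs on the unit sphere.
x = rng.standard_normal(width); x /= np.linalg.norm(x)
y = rng.standard_normal(width); y /= np.linalg.norm(y)

distances = []
for _ in range(depth):
    # Random layer map: i.i.d. Gaussian weights with variance sigma_w^2 / width.
    W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
    x = np.tanh(W @ x)
    y = np.tanh(W @ y)
    # Normalized distance between the two propagated representations.
    d = np.linalg.norm(x - y) / (np.linalg.norm(x) + np.linalg.norm(y) + 1e-12)
    distances.append(d)

# Print the distance at a few depths to see how it changes with the number of layers.
for layer in (1, 5, 10, 25, 50, 100):
    print(f"layer {layer:3d}: normalized distance = {distances[layer - 1]:.4f}")
```

Plotting `distances` against depth (or rerunning with different `sigma_w`) gives one simple way to probe how quickly randomly initialized deep compositions separate or collapse representations as depth grows; the paper's semi-invariant metrics and ergodic-theoretic limits address this regime rigorously.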