In this paper, we study the infinite-depth limit of finite-width residual neural networks with random Gaussian weights. With proper scaling, we show that by fixing the width and taking the depth to infinity, the vector of pre-activations converges in distribution to a zero-drift diffusion process. Unlike the infinite-width limit, where the pre-activations converge weakly to a Gaussian random variable, we show that the infinite-depth limit yields different distributions depending on the choice of the activation function. We document two cases where these distributions admit distinct closed-form expressions. We further show an intriguing phase-transition phenomenon in the post-activation norms as the width increases from 3 to 4. Lastly, we study the sequential infinite-depth-then-infinite-width limit, and show some key differences from the more commonly studied infinite-width-then-infinite-depth limit.
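For concreteness, a minimal sketch of the scaled residual recursion underlying such a limit is given below; the $1/\sqrt{L}$ branch scaling and the $\mathcal{N}(0, 1/n)$ weight normalization are standard assumptions in this setting and are not taken verbatim from the abstract:

$$
Y_{l+1} = Y_l + \frac{1}{\sqrt{L}}\, W_{l+1}\, \phi(Y_l), \qquad (W_l)_{ij} \overset{\text{i.i.d.}}{\sim} \mathcal{N}\!\left(0, \tfrac{1}{n}\right),
$$

where $n$ is the (fixed) width, $L$ the depth, $\phi$ the activation function, and $Y_l \in \mathbb{R}^n$ the pre-activation vector at layer $l$. Under this scaling, taking $L \to \infty$ at fixed $n$ yields the zero-drift diffusion limit described above.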