In this paper, we study the infinite-depth limit of finite-width residual neural networks with random Gaussian weights. With proper scaling, we show that by fixing the width and taking the depth to infinity, the pre-activations converge in distribution to a zero-drift diffusion process. Unlike the infinite-width limit, where the pre-activations converge weakly to a Gaussian random variable, we show that the infinite-depth limit yields different distributions depending on the choice of the activation function. We document two cases where these limiting distributions admit closed-form (and different) expressions. We further show an intriguing change-of-regime phenomenon of the post-activation norms when the width increases from 3 to 4. Lastly, we study the sequential limit infinite-depth-then-infinite-width and compare it with the more commonly studied infinite-width-then-infinite-depth limit.
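For concreteness, here is a minimal sketch of the setting described above; the precise scaling and initialization are specified in the paper body, and the sketch below assumes the standard $1/\sqrt{L}$ residual scaling with i.i.d. Gaussian weight entries of variance $1/n$:
\[
x_{l+1} \;=\; x_l \;+\; \frac{1}{\sqrt{L}}\, W_{l+1}\,\phi(x_l), \qquad (W_l)_{ij} \overset{\text{i.i.d.}}{\sim} \mathcal{N}\!\left(0, \tfrac{1}{n}\right), \quad 0 \le l \le L-1,
\]
where $n$ is the fixed width, $L$ the depth, and $\phi$ the activation function. Conditionally on $x_l$, the increment $W_{l+1}\phi(x_l)$ is Gaussian with covariance $\tfrac{\|\phi(x_l)\|^2}{n}\, I_n$ and zero mean, so heuristically, fixing $n$ and letting $L \to \infty$, the suitably interpolated pre-activations converge to a zero-drift diffusion of the form
\[
dX_t \;=\; \frac{\|\phi(X_t)\|}{\sqrt{n}}\, dB_t,
\]
with $B$ a standard $n$-dimensional Brownian motion.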