Deep residual network architectures have been shown to achieve superior accuracy over classical feed-forward networks, yet their success is still not fully understood. Focusing on massively over-parameterized, fully connected residual networks with ReLU activation through their respective neural tangent kernels (ResNTK), we provide here a spectral analysis of these kernels. Specifically, we show that, much like the NTK for fully connected networks (FC-NTK), for input distributed uniformly on the hypersphere $\mathbb{S}^{d-1}$, the eigenfunctions of ResNTK are the spherical harmonics and the eigenvalues decay polynomially with frequency $k$ as $k^{-d}$. These in turn imply that the set of functions in their Reproducing Kernel Hilbert Space is identical to that of FC-NTK, and consequently also to that of the Laplace kernel. We further show, by drawing on the analogy to the Laplace kernel, that depending on the choice of a hyper-parameter that balances between the skip and residual connections, ResNTK can either become spiky with depth, as with FC-NTK, or maintain a stable shape.
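The claimed $k^{-d}$ eigenvalue decay can be checked numerically via the Laplace kernel, whose RKHS the abstract identifies with that of ResNTK. For a dot-product kernel on $\mathbb{S}^{2}$ (so $d = 3$), the eigenfunctions are spherical harmonics and the eigenvalue at frequency $k$ is, up to a $k$-independent normalization, the projection of the kernel profile onto the degree-$k$ Legendre polynomial. The sketch below is illustrative only; the kernel scale `c` and the frequency grid are arbitrary choices, not values from the paper.

```python
import numpy as np
from scipy.special import eval_legendre

# Gauss-Legendre nodes for integrating over t = <x, y> on [-1, 1].
nodes, weights = np.polynomial.legendre.leggauss(4000)

def laplace_projection(k, c=1.0):
    """Project the Laplace kernel exp(-c * ||x - y||) restricted to
    S^2 (d = 3) onto the degree-k Legendre polynomial.  Up to a
    k-independent normalization this is the eigenvalue attached to
    the spherical harmonics of frequency k."""
    # On the unit sphere, ||x - y|| = sqrt(2 - 2 t) with t = <x, y>.
    f = np.exp(-c * np.sqrt(np.clip(2.0 - 2.0 * nodes, 0.0, None)))
    return np.sum(weights * f * eval_legendre(k, nodes))

ks = np.array([8, 16, 32, 64])
lams = np.array([laplace_projection(k) for k in ks])

# Fit the decay exponent on a log-log scale; the abstract's k^{-d}
# rate predicts a slope near -3 for d = 3.
slope = np.polyfit(np.log(ks), np.log(lams), 1)[0]
print("eigenvalues:", lams)
print(f"fitted decay exponent: {slope:.2f}")
```

The fitted exponent approaches $-d$ only asymptotically in $k$, so a fit over a finite frequency range lands near, but not exactly at, $-3$.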