Under mild conditions on the network initialization, we derive a power series expansion for the Neural Tangent Kernel (NTK) of arbitrarily deep feedforward networks in the infinite width limit. We provide expressions for the coefficients of this power series, which depend on both the Hermite coefficients of the activation function and the depth of the network. We observe that faster decay of the Hermite coefficients leads to faster decay of the NTK coefficients, and we explore the role of depth. Using this series, we first relate the effective rank of the NTK to the effective rank of the input-data Gram matrix. Second, for data drawn uniformly on the sphere, we study the eigenvalues of the NTK, analyzing the impact of the choice of activation function. Finally, for generic data and activation functions with sufficiently fast Hermite coefficient decay, we derive an asymptotic upper bound on the spectrum of the NTK.
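As a point of reference for the statements above, the objects involved can be sketched as follows; this is only a schematic under simplifying assumptions (e.g. unit-norm inputs), and the symbols $\mu_k$, $h_k$, $\kappa_p$ are notation introduced here for illustration rather than the paper's exact expressions. Any activation $\phi$ with $\mathbb{E}_{Z \sim \mathcal{N}(0,1)}[\phi(Z)^2] < \infty$ admits a Hermite expansion
\[
\phi(z) = \sum_{k=0}^{\infty} \mu_k(\phi)\, h_k(z),
\]
where $(h_k)_{k \ge 0}$ are the normalized probabilist's Hermite polynomials and $\mu_k(\phi)$ are the Hermite coefficients. The infinite-width NTK then takes the form of a dot-product power series,
\[
K_{\mathrm{NTK}}(x, x') = \sum_{p=0}^{\infty} \kappa_p \, \langle x, x' \rangle^{p},
\]
with coefficients $\kappa_p \ge 0$ determined by the $\mu_k(\phi)$ and the network depth. The decay and spectral statements in the abstract are phrased in terms of the decay of $\mu_k(\phi)$ and the induced decay of $\kappa_p$.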