Kernel ridge regression (KRR) has recently attracted renewed interest due to its potential for explaining the transient effects, such as double descent, that emerge during neural network training. In this work, we study how the alignment between the target function and the kernel affects the performance of KRR. We focus on truncated KRR (TKRR), which uses an additional parameter that controls the spectral truncation of the kernel matrix. We show that for polynomial alignment, there is an \emph{over-aligned} regime in which TKRR can achieve a faster rate than full KRR can. The rate of TKRR can improve all the way to the parametric rate, while that of full KRR is capped at a sub-optimal value. This shows that target alignment can be better leveraged by utilizing spectral truncation in kernel methods. We also consider the bandlimited alignment setting and show that the regularization surface of TKRR can exhibit transient effects, including multiple descent and non-monotonic behavior. Our results show that there is a strong and quantifiable relation between the shape of the \emph{alignment spectrum} and the generalization performance of kernel methods, both in terms of rates and in finite samples.
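To make the notion of spectral truncation concrete, the following is a minimal sketch of one plausible reading of TKRR: keep only the top `r` eigenpairs of the normalized kernel matrix and apply the usual ridge shrinkage on those directions. The function names (`rbf_kernel`, `tkrr_dual_weights`), the choice of an RBF kernel, the `1/n` normalization, and the toy data are all assumptions made for illustration; the abstract does not specify the estimator at this level of detail.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z (illustrative kernel choice)."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def tkrr_dual_weights(K, y, r, lam):
    """Dual weights of a rank-r truncated ridge fit (hypothetical helper, not the paper's code).

    With r == n this reduces to ordinary KRR: alpha = (K + n * lam * I)^{-1} y.
    """
    n = K.shape[0]
    # Eigendecomposition of the normalized kernel matrix K / n (eigh returns ascending order).
    evals, evecs = np.linalg.eigh(K / n)
    top = np.argsort(evals)[::-1][:r]       # indices of the r largest eigenvalues
    mu = np.clip(evals[top], 0.0, None)     # guard against tiny negative round-off
    U = evecs[:, top]
    # Ridge shrinkage applied only along the retained eigendirections.
    return U @ ((U.T @ y) / (mu + lam)) / n

# Toy data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(40)

K = rbf_kernel(X, X)
alpha_full = tkrr_dual_weights(K, y, r=40, lam=1e-2)   # no truncation: plain KRR
alpha_trunc = tkrr_dual_weights(K, y, r=5, lam=1e-2)   # keep 5 spectral components

# Predictions at new points use the kernel between test and training inputs.
X_new = np.linspace(-1.0, 1.0, 5)[:, None]
pred_trunc = rbf_kernel(X_new, X) @ alpha_trunc
```

In this sketch the pair `(r, lam)` plays the role of the two tuning parameters; sweeping both over a grid and recording the test error would trace out a regularization surface of the kind the abstract refers to.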