The generalization performance of kernel ridge regression (KRR) exhibits a multi-phased pattern that crucially depends on the scaling relationship between the sample size $n$ and the underlying dimension $d$. This phenomenon arises because KRR sequentially learns functions of increasing complexity as the sample size grows; when $d^{k-1}\ll n\ll d^{k}$, only polynomials of degree less than $k$ are learned. In this paper, we present a sharp asymptotic characterization of the performance of KRR at the critical transition regions with $n \asymp d^k$, for $k\in\mathbb{Z}^{+}$. Our asymptotic characterization provides a precise picture of the whole learning process and clarifies the impact of various parameters (including the choice of the kernel function) on the generalization performance. In particular, we show that the learning curves of KRR can exhibit a delicate "double descent" behavior due to specific bias-variance trade-offs in different polynomial scaling regimes.
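To make the staircase picture concrete, the following is a minimal sketch of the previously known polynomial-regime result (stated, for instance, for inner-product kernels on the sphere), not the critical-regime characterization developed in this paper; here $f_*$ denotes the target function and $\mathsf{P}_{\geq k}$ the $L^2$-projection onto polynomials of degree at least $k$:
\[
	d^{k-1} \ll n \ll d^{k}
	\quad\Longrightarrow\quad
	\mathbb{E}\big[(f_*(x) - \hat{f}_{\mathrm{KRR}}(x))^2\big]
	= \big\|\mathsf{P}_{\geq k} f_*\big\|_{L^2}^2 + o_d(1),
\]
i.e., KRR fits the degree-$(k-1)$ polynomial component of $f_*$ and incurs the full error of all higher-degree components. The present work refines this picture exactly at the transitions $n \asymp d^k$, where the plateaus of the staircase meet.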