The practical success of overparameterized neural networks has motivated the recent scientific study of interpolating methods, which perfectly fit their training data. Certain interpolating methods, including neural networks, can fit noisy training data without catastrophically bad test performance, in defiance of standard intuitions from statistical learning theory. Aiming to explain this, a body of recent work has studied benign overfitting, a phenomenon where some interpolating methods approach Bayes optimality, even in the presence of noise. In this work we argue that while benign overfitting has been instructive and fruitful to study, many real interpolating methods like neural networks do not fit benignly: modest noise in the training set causes nonzero (but non-infinite) excess risk at test time, implying these models are neither benign nor catastrophic but rather fall in an intermediate regime. We call this intermediate regime tempered overfitting, and we initiate its systematic study. We first explore this phenomenon in the context of kernel (ridge) regression (KR) by obtaining conditions on the ridge parameter and kernel eigenspectrum under which KR exhibits each of the three behaviors. We find that kernels with power-law spectra, including Laplace kernels and ReLU neural tangent kernels, exhibit tempered overfitting. We then empirically study deep neural networks through the lens of our taxonomy, and find that those trained to interpolation are tempered, while those stopped early are benign. We hope our work leads to a more refined understanding of overfitting in modern learning.
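To make the taxonomy concrete, the following is a minimal illustrative sketch, not the paper's experimental setup: it fits kernel ridge regression with a Laplace kernel to noisy labels, once with a near-zero ridge (near-interpolation) and once with an explicit ridge, and scores both against clean targets so the reported test MSE is the excess risk. The sine target, noise level, and bandwidth `gamma` are arbitrary assumptions chosen for illustration.

```python
# Illustrative sketch of the three-regime taxonomy for kernel ridge regression.
# Assumptions (not from the paper): sin target, noise_std=0.3, gamma=10.0.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
target = lambda x: np.sin(2 * np.pi * x).ravel()  # assumed ground-truth function
noise_std = 0.3                                   # label noise level

n_train, n_test = 500, 2000
X_train = rng.uniform(0, 1, size=(n_train, 1))
y_train = target(X_train) + noise_std * rng.normal(size=n_train)
X_test = rng.uniform(0, 1, size=(n_test, 1))

for alpha in [1e-8, 1e-2]:  # near-interpolating vs. explicitly ridged
    # The Laplace kernel has a power-law eigenspectrum, the tempered case above.
    model = KernelRidge(alpha=alpha, kernel="laplacian", gamma=10.0)
    model.fit(X_train, y_train)
    train_mse = np.mean((model.predict(X_train) - y_train) ** 2)
    # Excess risk: test MSE against the *clean* targets (Bayes level is 0 here).
    excess = np.mean((model.predict(X_test) - target(X_test)) ** 2)
    print(f"alpha={alpha:g}  train MSE={train_mse:.4f}  excess risk={excess:.4f}")
```

Under tempered overfitting one would expect the near-interpolating fit to drive the train error to (almost) zero while its excess risk stays strictly positive but bounded, above the ridged fit yet far from diverging; a benign method would instead approach zero excess risk, and a catastrophic one would blow up.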