Classical wisdom suggests that estimators should avoid fitting noise to achieve good generalization. In contrast, modern overparameterized models can yield small test error despite interpolating noise -- a phenomenon often called "benign overfitting" or "harmless interpolation". This paper argues that the degree to which interpolation is harmless hinges upon the strength of an estimator's inductive bias, i.e., how heavily the estimator favors solutions with a certain structure: while strong inductive biases prevent harmless interpolation, weak inductive biases can even require fitting noise to generalize well. Our main theoretical result establishes tight non-asymptotic bounds for high-dimensional kernel regression that reflect this phenomenon for convolutional kernels, where the filter size regulates the strength of the inductive bias. We further provide empirical evidence of the same behavior for deep neural networks with varying filter sizes and rotational invariance.
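For concreteness, a convolutional kernel of the kind studied here can be sketched as follows (this is a generic form; the paper's exact definition may differ). With inputs $x, x' \in \mathbb{R}^d$ and a filter size $q$, one averages a base kernel $\kappa$ over all (cyclic) length-$q$ patches,
$$k(x, x') = \frac{1}{d} \sum_{i=1}^{d} \kappa\big(x_{(i:i+q)}, x'_{(i:i+q)}\big),$$
where $x_{(i:i+q)}$ denotes the patch of $x$ starting at coordinate $i$. A small filter size $q$ imposes a strong locality bias, whereas a large $q$ weakens this structural preference, which is the sense in which the filter size regulates the strength of the inductive bias.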