The Implicit Bias of Benign Overfitting

From arXiv. Comments: a thoroughly rewritten version, extending the results from spherical Gaussians to general high-dimensional distributions and characterizing the asymptotic structure of the min-norm and max-margin predictors in this framework, plus various small improvements to the presentation.

The phenomenon of benign overfitting, where a predictor perfectly fits noisy training data while attaining low expected loss, has received much attention in recent years, but still remains not fully understood beyond well-specified linear regression setups. In this paper, we provide several new results on when one can or cannot expect benign overfitting to occur, for both regression and classification tasks. We consider a prototypical and rather generic data model for benign overfitting of linear predictors, where an arbitrary input distribution of some fixed dimension $k$ is concatenated with a high-dimensional distribution. For linear regression which is not necessarily well-specified, we show that the minimum-norm interpolating predictor (that standard training methods converge to) is biased towards an inconsistent solution in general, hence benign overfitting will generally not occur. Moreover, we show how this can be extended beyond standard linear regression, by an argument proving how the existence of benign overfitting on some regression problems precludes its existence on other regression problems. We then turn to classification problems, and show that the situation there is much more favorable. Specifically, we prove that the max-margin predictor (to which standard training methods are known to converge in direction) is asymptotically biased towards minimizing a weighted squared hinge loss. This allows us to reduce the question of benign overfitting in classification to the simpler question of whether this loss is a good surrogate for the misclassification error, and use it to show benign overfitting in some new settings.
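As a rough illustration of the objects the abstract refers to, the sketch below builds a toy dataset in which a $k$-dimensional input is concatenated with a high-dimensional one, then computes the minimum-norm interpolating predictor with a pseudoinverse. The Gaussian inputs, the scaling of the high-dimensional block, the linear target, and the names `make_data` and `min_norm_interpolator` are all assumptions chosen for illustration; they are not taken from the paper, and the sketch does not reproduce any of its results.

```python
import numpy as np


def make_data(n, k, d_high, noise_std, rng):
    """Toy data: k low-dimensional coordinates concatenated with d_high
    high-dimensional ones. Distributions and scaling are illustrative
    assumptions, not the paper's setting."""
    x_low = rng.standard_normal((n, k))
    x_high = rng.standard_normal((n, d_high)) / np.sqrt(d_high)
    w_star = np.ones(k)  # hypothetical ground-truth weights on the low-dim part
    y = x_low @ w_star + noise_std * rng.standard_normal(n)
    return np.hstack([x_low, x_high]), y


def min_norm_interpolator(X, y):
    """Minimum Euclidean-norm w with X @ w = y (requires full row rank, n < d)."""
    return np.linalg.pinv(X) @ y


rng = np.random.default_rng(0)
X, y = make_data(n=50, k=3, d_high=2000, noise_std=0.5, rng=rng)
w_hat = min_norm_interpolator(X, y)

# The predictor fits the noisy labels exactly (up to floating-point error).
train_residual = np.max(np.abs(X @ w_hat - y))

# Any other interpolator differs from w_hat by a null-space vector of X
# and therefore has a larger norm.
v = rng.standard_normal(X.shape[1])
v_null = v - np.linalg.pinv(X) @ (X @ v)  # component of v in the null space of X
w_other = w_hat + v_null
other_residual = np.max(np.abs(X @ w_other - y))

print(f"max training residual: {train_residual:.2e}")
print(f"norm of min-norm solution: {np.linalg.norm(w_hat):.3f}")
print(f"norm of another interpolator: {np.linalg.norm(w_other):.3f}")
```

Whether such an interpolator also generalizes well is exactly the question the paper studies; this sketch only shows that the predictor interpolates and has the smallest norm among interpolators.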


Related topic: Overfitting

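In AI, overfitting usually means that a machine-learned model is too complex: it performs well on the training set but poorly on the test set, so its generalization is poor. Overfitting is also called over-learning. It arises during parameter fitting because the training data contain sampling error, and a sufficiently complex model fits that sampling error along with the underlying signal.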
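A minimal sketch of this effect, assuming a made-up target function `sin(3x)`, Gaussian label noise, and polynomial models of degree 3 and 9 fitted with `np.polyfit`: the degree-9 polynomial passes through all ten noisy training points, so its training error is essentially zero, while its error against the noise-free function on a dense grid is not.

```python
import numpy as np

rng = np.random.default_rng(0)


def true_fn(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return np.sin(3 * x)


# Ten noisy training points; the noise plays the role of sampling error.
x_train = np.linspace(-1.0, 1.0, 10)
y_train = true_fn(x_train) + 0.3 * rng.standard_normal(x_train.size)

# Dense, noise-free grid used to estimate generalization error.
x_test = np.linspace(-1.0, 1.0, 500)
y_test = true_fn(x_test)


def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


results = {}
for degree in (3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))

for degree, (train_err, test_err) in results.items():
    print(f"degree {degree}: train MSE = {train_err:.2e}, test MSE = {test_err:.2e}")
```

The degree-9 model has ten coefficients for ten points, so it can fit the noise exactly; the lower-degree model cannot, which is why its training error is larger.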
