Classification vs regression in overparameterized regimes: Does the loss function matter?

We compare classification and regression tasks in an overparameterized linear model with Gaussian features. On the one hand, we show that with sufficient overparameterization all training points are support vectors: solutions obtained by least-squares minimum-norm interpolation, typically used for regression, are identical to those produced by the hard-margin support vector machine (SVM) that minimizes the hinge loss, typically used for training classifiers. On the other hand, we show that there exist regimes where these interpolating solutions generalize well when evaluated by the 0-1 test loss function, but do not generalize if evaluated by the square loss function, i.e. they approach the null risk. Our results demonstrate the very different roles and properties of loss functions used at the training phase (optimization) and the testing phase (generalization).
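The first claim can be checked numerically with a minimal sketch. It assumes isotropic Gaussian features and random ±1 labels; the sizes `n` and `d`, the seed, and the helper names are illustrative choices, not taken from the paper. It computes the minimum-norm interpolator w = Xᵀ(XXᵀ)⁻¹y and tests the KKT conditions of the hard-margin SVM: if every dual coefficient αᵢ = yᵢ·βᵢ, with β = (XXᵀ)⁻¹y, is positive and every margin equals 1, then w is also the hard-margin SVM solution and every training point is a support vector.

```python
import numpy as np

def min_norm_interpolator(X, y):
    """Return (w, beta), where w = X^T beta is the minimum-l2-norm solution of X w = y."""
    gram = X @ X.T                    # n x n Gram matrix, invertible when d >> n
    beta = np.linalg.solve(gram, y)   # beta = (X X^T)^{-1} y
    return X.T @ beta, beta

def all_support_vectors(beta, y):
    """KKT check: w = sum_i alpha_i * y_i * x_i with alpha_i = y_i * beta_i.
    If all alpha_i > 0 (and all margins equal 1), w solves the hard-margin SVM."""
    return bool(np.all(y * beta > 0))

rng = np.random.default_rng(0)
n, d = 20, 5000                       # illustrative sizes with d >> n
X = rng.standard_normal((n, d))       # isotropic Gaussian features (assumption)
y = rng.choice([-1.0, 1.0], size=n)   # random binary labels (assumption)

w, beta = min_norm_interpolator(X, y)
margins = y * (X @ w)                 # interpolation forces every margin to equal 1
print("all margins equal 1:", np.allclose(margins, 1.0))
print("all training points are support vectors:", all_support_vectors(beta, y))
```

Shrinking `d` toward `n` in this sketch should eventually make some αᵢ negative, at which point the two solutions no longer coincide.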


Related content

Loss function (machine learning)


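A loss function, sometimes also called a distance function or metric function in AI, measures how much a model's prediction f(x) disagrees with the true value Y. The "distance" here is abstract: it stands for the error between the true data and the predicted data. A loss function is a non-negative real-valued function, usually written L(Y, f(x)); the smaller its value, the better the model fits the data on which it is evaluated. The loss function is the core of the empirical risk function and an important component of the structural risk function.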
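As a small illustration of L(Y, f(x)), the sketch below evaluates three common losses on labels in {-1, +1}: the 0-1 loss, the square loss, and the hinge loss. The function names and sample values are hypothetical, chosen only to show that a prediction with the correct sign but a small magnitude has zero 0-1 loss while its square loss stays close to 1, the risk of always predicting zero.

```python
import numpy as np

def zero_one_loss(y, score):
    """1.0 where the sign of the score disagrees with the label y in {-1, +1}."""
    return (np.sign(score) != y).astype(float)

def square_loss(y, score):
    """Squared difference between label and real-valued score."""
    return (y - score) ** 2

def hinge_loss(y, score):
    """Zero only when the label-weighted score y * score is at least 1."""
    return np.maximum(0.0, 1.0 - y * score)

y = np.array([1.0, -1.0, 1.0])
score = np.array([0.1, -0.1, -2.0])   # two weak but correct scores, one wrong score

print(zero_one_loss(y, score))        # per-example 0-1 loss
print(square_loss(y, score))          # per-example square loss
print(hinge_loss(y, score))           # per-example hinge loss
```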
