Benign overfitting, the phenomenon where interpolating models generalize well in the presence of noisy data, was first observed in neural network models trained with gradient descent. To better understand this empirical observation, we consider the generalization error of two-layer neural networks trained to interpolation by gradient descent on the logistic loss following random initialization. We assume the data comes from well-separated class-conditional log-concave distributions and allow for a constant fraction of the training labels to be corrupted by an adversary. We show that in this setting, neural networks exhibit benign overfitting: they can be driven to zero training error, perfectly fitting any noisy training labels, and simultaneously achieve test error close to the Bayes-optimal error. In contrast to previous work on benign overfitting that requires linear or kernel-based predictors, our analysis holds in a setting where both the model and the learning dynamics are fundamentally nonlinear.
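The setting described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's actual experiment or analysis: it draws data from two well-separated Gaussian clusters (a log-concave class-conditional distribution), flips a fraction of the training labels, trains the first layer of a leaky-ReLU two-layer network by gradient descent on the logistic loss from random initialization, and reports training error on the noisy labels alongside test error on clean labels. All parameter values (dimension, sample size, separation, noise rate, width, step size) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Synthetic data: two well-separated Gaussian clusters (log-concave),
#     with a fraction of training labels flipped. Parameters are illustrative. ---
d, n, n_test = 500, 100, 2000          # dimension, train size, test size
sep, noise_rate = 4.0, 0.1             # cluster separation ||mu||, label-flip fraction
mu = sep * np.ones(d) / np.sqrt(d)     # class mean

def sample(n_samples):
    y = rng.choice([-1.0, 1.0], size=n_samples)
    x = y[:, None] * mu + rng.standard_normal((n_samples, d))
    return x, y

X, y_clean = sample(n)
flip = rng.random(n) < noise_rate
y = np.where(flip, -y_clean, y_clean)  # corrupted training labels
X_test, y_test = sample(n_test)        # test labels are clean

# --- Two-layer network with leaky-ReLU hidden units; only the first layer is
#     trained, second-layer weights are fixed random signs (a common simplification). ---
m, alpha, lr, steps = 512, 0.1, 0.5, 10000
W = rng.standard_normal((m, d)) / np.sqrt(d)   # random initialization
a = rng.choice([-1.0, 1.0], size=m) / m        # fixed output weights

def forward(X_in, W_in):
    Z = X_in @ W_in.T                          # (n, m) pre-activations
    H = np.where(Z > 0, Z, alpha * Z)          # leaky ReLU
    return H @ a                               # network output f(x)

for t in range(steps):
    out = forward(X, W)
    s = -y / (1.0 + np.exp(y * out))           # d(logistic loss)/d(output)
    Z = X @ W.T
    dphi = np.where(Z > 0, 1.0, alpha)         # leaky-ReLU derivative
    grad_W = ((s[:, None] * dphi) * a).T @ X / n
    W -= lr * grad_W                           # full-batch gradient descent step

train_err = np.mean(np.sign(forward(X, W)) != y)
test_err = np.mean(np.sign(forward(X_test, W)) != y_test)
print(f"train error (noisy labels): {train_err:.3f}")
print(f"test error  (clean labels): {test_err:.3f}")
```

In the overparameterized regime sketched here (dimension and width large relative to the sample size), the network can fit the flipped labels while the test error on clean labels remains close to the Bayes error of the mixture, which is the qualitative behavior the abstract refers to as benign overfitting.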