Modern deep learning models with great expressive power can be trained to overfit the training data but still generalize well. This phenomenon is referred to as benign overfitting. Recently, a few studies have attempted to theoretically understand benign overfitting in neural networks. However, these works are either limited to neural networks with smooth activation functions or to the neural tangent kernel regime. How and when benign overfitting can occur in ReLU neural networks remains an open problem. In this work, we seek to answer this question by establishing algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise. We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes-optimal test risk. Our result also reveals a sharp transition, in terms of test risk, between benign and harmful overfitting under different conditions on the data distribution. Experiments on synthetic data back up our theory.
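To make the setting concrete, below is a minimal, self-contained sketch of the kind of synthetic experiment the abstract describes: two-patch signal-plus-noise data with label-flipping noise, a two-layer ReLU convolutional network with fixed second-layer weights, trained by full-batch gradient descent on the logistic loss. This is an illustration only, not the paper's experimental code; the data model, dimensions, and all hyperparameters below are assumptions chosen for readability.

```python
import torch

torch.manual_seed(0)

# Synthetic two-patch data: one patch carries the class signal y * mu, the other
# is pure Gaussian noise. All sizes and constants here are illustrative choices.
n, d, m = 100, 200, 10            # training examples, patch dimension, filters per class
flip_prob, noise_std = 0.1, 1.0   # label-flipping rate and patch noise level
lr, steps = 0.1, 2000             # gradient-descent step size and iterations

mu = torch.zeros(d)
mu[0] = 10.0                      # fixed signal direction and strength (assumed)

y_clean = torch.randint(0, 2, (n,)).float() * 2 - 1          # clean labels in {-1, +1}
flip = torch.bernoulli(torch.full((n,), flip_prob))
y = y_clean * (1 - 2 * flip)                                  # observed labels with flipping noise

X = torch.stack([y_clean.unsqueeze(1) * mu,                   # signal patch
                 noise_std * torch.randn(n, d)], dim=1)       # noise patch; X has shape (n, 2, d)

# Two-layer ReLU CNN: m filters per class applied to both patches,
# with second-layer weights fixed to +1/m and -1/m.
W_pos = (0.01 * torch.randn(m, d)).requires_grad_()
W_neg = (0.01 * torch.randn(m, d)).requires_grad_()

def net(x):
    # x: (batch, 2, d) -> scalar score per example
    s_pos = torch.relu(torch.einsum('bpd,md->bpm', x, W_pos)).sum(dim=(1, 2)) / m
    s_neg = torch.relu(torch.einsum('bpd,md->bpm', x, W_neg)).sum(dim=(1, 2)) / m
    return s_pos - s_neg

# Full-batch gradient descent on the logistic loss over the noisy labels.
for _ in range(steps):
    loss = torch.nn.functional.softplus(-y * net(X)).mean()
    W_pos.grad = W_neg.grad = None
    loss.backward()
    with torch.no_grad():
        W_pos -= lr * W_pos.grad
        W_neg -= lr * W_neg.grad

with torch.no_grad():
    # Fit to the noisy training labels (overfitting) vs. accuracy on fresh clean data.
    train_acc = (net(X).sign() == y).float().mean().item()
    y_test = torch.randint(0, 2, (1000,)).float() * 2 - 1
    X_test = torch.stack([y_test.unsqueeze(1) * mu,
                          noise_std * torch.randn(1000, d)], dim=1)
    test_acc = (net(X_test).sign() == y_test).float().mean().item()
    print(f"train accuracy on noisy labels: {train_acc:.3f}")
    print(f"test accuracy on clean labels:  {test_acc:.3f} (flip rate = {flip_prob})")
```

In the benign-overfitting regime the abstract refers to, one would expect the network to fit the flipped training labels (train accuracy near 1 on noisy labels) while the clean test accuracy stays close to 1, i.e., the test risk on the noisy distribution approaches the flip rate, which is the Bayes-optimal level; with a weaker signal or heavier noise, the same training procedure would instead exhibit harmful overfitting.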