Modern neural networks often have great expressive power and can be trained to overfit the training data while still achieving good test performance. This phenomenon is referred to as "benign overfitting". Recently, a line of work has emerged studying benign overfitting from a theoretical perspective. However, these works are limited to linear models or kernel/random feature models, and a theoretical understanding of when and how benign overfitting occurs in neural networks is still lacking. In this paper, we study the benign overfitting phenomenon in training a two-layer convolutional neural network (CNN). We show that when the signal-to-noise ratio satisfies a certain condition, a two-layer CNN trained by gradient descent can achieve arbitrarily small training and test loss. On the other hand, when this condition does not hold, overfitting becomes harmful and the obtained CNN can only achieve constant-level test loss. Together, these results demonstrate a sharp phase transition between benign and harmful overfitting, driven by the signal-to-noise ratio. To the best of our knowledge, this is the first work that precisely characterizes the conditions under which benign overfitting can occur in training convolutional neural networks.
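To make the setting concrete, the sketch below simulates a simplified version of this setup: a two-layer CNN with shared filters over two patches and a fixed second layer, trained by full-batch gradient descent on a signal-plus-noise data model, with the signal strength swept to probe the transition from harmful to benign overfitting. It is an illustrative assumption, not the paper's exact construction; all names (e.g., `NOISE_STD`, `make_data`, `run`) and all sizes, step counts, and learning rates are hypothetical choices made only for the demonstration.

```python
# A minimal empirical sketch of the setting described above, *not* the paper's
# exact construction: a two-layer CNN (shared ReLU filters applied to two
# patches, fixed second layer) trained by full-batch gradient descent on a
# signal-plus-noise data model.  All constants here are illustrative.
import torch

torch.manual_seed(0)
d, n_train, n_test, n_filters = 100, 50, 500, 10
NOISE_STD = 1.0

def make_data(n, signal_strength):
    """Each example has 2 patches: one carries y * mu, the other is pure noise."""
    mu = torch.zeros(d); mu[0] = signal_strength           # fixed signal direction
    y = torch.randint(0, 2, (n,)).float() * 2 - 1           # labels in {-1, +1}
    signal_patch = y[:, None] * mu                          # (n, d)
    noise_patch = NOISE_STD * torch.randn(n, d)             # (n, d)
    x = torch.stack([signal_patch, noise_patch], dim=1)     # (n, 2, d)
    return x, y

def cnn(x, W):
    """Two-layer CNN: ReLU filters applied to every patch and summed; the
    second layer is fixed to +1 / -1 for the two halves of the filters."""
    a = torch.cat([torch.ones(n_filters // 2), -torch.ones(n_filters // 2)])
    acts = torch.relu(torch.einsum('npd,jd->npj', x, W))    # (n, 2, n_filters)
    return (acts.sum(dim=1) * a).sum(dim=1)                 # (n,)

def run(signal_strength, steps=2000, lr=0.1):
    x_tr, y_tr = make_data(n_train, signal_strength)
    x_te, y_te = make_data(n_test, signal_strength)
    W = 0.01 * torch.randn(n_filters, d, requires_grad=True)
    for _ in range(steps):
        # logistic loss log(1 + exp(-y f(x))) = softplus(-y f(x))
        loss = torch.nn.functional.softplus(-y_tr * cnn(x_tr, W)).mean()
        loss.backward()
        with torch.no_grad():
            W -= lr * W.grad
            W.grad.zero_()
    with torch.no_grad():
        train_err = (cnn(x_tr, W).sign() != y_tr).float().mean().item()
        test_err = (cnn(x_te, W).sign() != y_te).float().mean().item()
    return train_err, test_err

# Sweep the signal strength (hence the signal-to-noise ratio): training error
# should reach zero in both regimes, while test error stays near chance level
# for weak signal and drops once the signal is strong enough.
for s in [0.1, 0.5, 1.0, 2.0, 4.0]:
    tr, te = run(s)
    print(f"signal strength {s:.1f}: train error {tr:.2f}, test error {te:.2f}")
```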