The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks

The recent success of neural network models has shed light on a rather surprising statistical phenomenon: statistical models that perfectly fit noisy data can generalize well to unseen test data. Understanding this phenomenon of $\textit{benign overfitting}$ has attracted intense theoretical and empirical study. In this paper, we consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss and derive bounds on the excess risk when the covariates satisfy sub-Gaussianity and anti-concentration properties, and the noise is independent and sub-Gaussian. By leveraging recent results that characterize the implicit bias of this estimator, our bounds emphasize the role of both the quality of the initialization and the properties of the data covariance matrix in achieving low excess risk.
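To make the setting in the abstract concrete, here is a minimal sketch, not the paper's own construction. It assumes the parameterization $f(x) = a^\top W x$, Gaussian covariates with identity covariance, and small-step gradient descent as a stand-in for gradient flow. The function names, dimensions, step size, and initialization scale are all illustrative choices, not values from the paper.

```python
import numpy as np


def make_data(n=20, d=200, noise_std=0.5, seed=0):
    """Noisy linear data with d > n (assumed: isotropic Gaussian covariates)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    theta_star = rng.standard_normal(d)
    theta_star /= np.linalg.norm(theta_star)  # unit-norm ground truth
    y = X @ theta_star + noise_std * rng.standard_normal(n)
    return X, y, theta_star


def train_two_layer_linear(X, y, width=16, init_scale=0.1, lr=0.01,
                           steps=10000, seed=1):
    """Gradient descent on the squared loss for f(x) = a^T W x.

    A small step size is used as a crude discretization of gradient flow.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = init_scale * rng.standard_normal((width, d)) / np.sqrt(d)
    a = init_scale * rng.standard_normal(width) / np.sqrt(width)
    for _ in range(steps):
        theta = W.T @ a              # effective linear predictor, shape (d,)
        residual = X @ theta - y     # shape (n,)
        g = X.T @ residual / n       # gradient of the loss w.r.t. theta
        grad_W = np.outer(a, g)      # chain rule through theta = W^T a
        grad_a = W @ g
        W -= lr * grad_W
        a -= lr * grad_a
    return W, a


def excess_risk_isotropic(theta_hat, theta_star):
    """Excess risk when the covariate covariance is the identity (assumed)."""
    return float(np.sum((theta_hat - theta_star) ** 2))


X, y, theta_star = make_data()
W, a = train_two_layer_linear(X, y)
theta_hat = W.T @ a
print("train MSE:", np.mean((X @ theta_hat - y) ** 2))
print("excess risk:", excess_risk_isotropic(theta_hat, theta_star))
# Reference point only: the minimum-l2-norm interpolant of the same data.
theta_min_norm = np.linalg.pinv(X) @ y
print("distance to min-norm interpolant:",
      np.linalg.norm(theta_hat - theta_min_norm))
```

Varying `init_scale`, or replacing the identity covariance with one whose eigenvalues decay, is one way to probe the two factors the abstract singles out: the quality of the initialization and the properties of the data covariance matrix.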

Related content

Overfitting

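In AI, overfitting usually means that a learned model is too complex: it performs well on the training set but poorly on the test set, so it generalizes badly. Overfitting, also called over-learning, arises because the training data contain sampling error, and a complex model fits that error along with the signal during parameter fitting.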
