In this paper, we study the generalization performance of minimum-$\ell_2$-norm overfitting solutions for the neural tangent kernel (NTK) model of a two-layer neural network with ReLU activation and no bias term. We show that, depending on the ground-truth function, the test error of overfitted NTK models exhibits characteristics that differ from the "double descent" of other overparameterized linear models with simple Fourier or Gaussian features. Specifically, for a class of learnable functions, we provide a new upper bound on the generalization error that approaches a small limiting value, even when the number of neurons $p$ approaches infinity. This limiting value further decreases with the number of training samples $n$. For functions outside of this class, we provide a lower bound on the generalization error that does not diminish to zero even when $n$ and $p$ are both large.
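To make the object of study concrete, the following is a minimal sketch of a minimum-$\ell_2$-norm overfitting solution on NTK features, under assumptions that the abstract does not state: only the first-layer weights are linearized, so the feature for neuron $j$ is $\mathbf{1}\{x^\top v_j > 0\}\,x$; inputs lie on the unit sphere; and the ground truth is the linear function $f(x) = x_1$. The names `ntk_features`, `min_norm_fit`, and all sizes are hypothetical and chosen only for illustration.

```python
import numpy as np


def ntk_features(X, V0):
    """Assumed NTK feature map of a bias-free two-layer ReLU network.

    X:  (n, d) inputs.  V0: (p, d) initial first-layer weights.
    Returns H of shape (n, p * d); block j of row i is 1{x_i . V0[j] > 0} * x_i.
    """
    active = (X @ V0.T > 0).astype(X.dtype)  # (n, p) ReLU activation pattern
    return (active[:, :, None] * X[:, None, :]).reshape(X.shape[0], -1)


def min_norm_fit(H, y):
    """Minimum l2-norm w satisfying H w = y when H has full row rank."""
    return np.linalg.pinv(H) @ y


def unit_rows(A):
    """Normalize each row to unit length (inputs on the sphere: an assumption)."""
    return A / np.linalg.norm(A, axis=1, keepdims=True)


rng = np.random.default_rng(0)
n, d, p = 20, 3, 200  # overparameterized: p * d = 600 > n = 20

X = unit_rows(rng.standard_normal((n, d)))
V0 = rng.standard_normal((p, d))
y = X[:, 0]  # hypothetical ground truth f(x) = x_1

H = ntk_features(X, V0)
w = min_norm_fit(H, y)

# Overfitting: the training data are interpolated up to numerical error.
train_residual = float(np.max(np.abs(H @ w - y)))

# Empirical test error on fresh inputs drawn the same way.
X_test = unit_rows(rng.standard_normal((1000, d)))
test_mse = float(np.mean((ntk_features(X_test, V0) @ w - X_test[:, 0]) ** 2))

print(f"train residual: {train_residual:.2e}, test MSE: {test_mse:.4f}")
```

Varying `p` and `n` in this sketch gives an empirical view of how the test error of the interpolating solution behaves as the number of neurons or training samples grows; it does not reproduce the bounds themselves.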