In this paper we study lower bounds on the generalization error of models derived from multi-layer neural networks, in the regime where the size of the layers is commensurate with the number of samples in the training data. We show that unbiased estimators have unacceptable performance for such nonlinear networks in this regime. We derive explicit generalization lower bounds for general biased estimators, in the cases of linear regression and of two-layer networks. In the linear case the bound is asymptotically tight. In the nonlinear case, we compare our bounds with an empirical study of the stochastic gradient descent algorithm. The analysis uses elements from the theory of large random matrices.
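To make the regime concrete, below is a minimal sketch (not the paper's construction) of the kind of experiment the abstract alludes to: a two-layer "student" network is trained by stochastic gradient descent on data generated by a planted "teacher" network, with the number of training samples kept proportional to the hidden-layer width, and the test error is used as an estimate of the generalization error. The specific choices (ReLU activation, Gaussian inputs, the 2:1 sample-to-width ratio, learning rate, noise level) are illustrative assumptions only.

```python
# Minimal sketch of SGD training of a two-layer network in the proportional
# regime (sample size commensurate with layer width). All numbers are
# illustrative assumptions, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)

d, width = 50, 100          # input dimension, hidden-layer width
n_train = 2 * width         # sample size scales with the layer size
n_test = 2000

def forward(X, W1, w2):
    """Two-layer network: x -> w2^T relu(W1 x)."""
    return np.maximum(X @ W1.T, 0.0) @ w2

# Planted teacher network generating the labels (with small additive noise).
W1_t = rng.normal(size=(width, d)) / np.sqrt(d)
w2_t = rng.normal(size=width) / np.sqrt(width)
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
y_train = forward(X_train, W1_t, w2_t) + 0.1 * rng.normal(size=n_train)
y_test = forward(X_test, W1_t, w2_t)

# Student network trained by plain SGD on the squared loss.
W1 = rng.normal(size=(width, d)) / np.sqrt(d)
w2 = rng.normal(size=width) / np.sqrt(width)
lr, epochs = 0.01, 200

for epoch in range(epochs):
    for i in rng.permutation(n_train):
        x, y = X_train[i], y_train[i]
        h = np.maximum(W1 @ x, 0.0)              # hidden activations
        err = h @ w2 - y                         # residual on this sample
        w2 -= lr * err * h                       # gradient w.r.t. output weights
        W1 -= lr * err * np.outer(w2 * (h > 0), x)  # gradient w.r.t. first layer

train_mse = np.mean((forward(X_train, W1, w2) - y_train) ** 2)
test_mse = np.mean((forward(X_test, W1, w2) - y_test) ** 2)
print(f"train MSE = {train_mse:.4f}, generalization (test) MSE = {test_mse:.4f}")
```

In this sketch the empirical test error plays the role of the generalization error that the paper's lower bounds constrain; rescaling `width` and `n_train` together keeps the experiment in the proportional regime.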