We study the problem of estimating an unknown function from noisy data using shallow (single-hidden-layer) ReLU neural networks. The estimators under consideration minimize the sum of squared data-fitting errors plus a regularization term proportional to the squared Euclidean norm of the network weights. This minimization corresponds to the common approach of training a neural network with weight decay. We quantify the performance (mean-squared error) of these neural network estimators when the data-generating function belongs to the space of functions of second-order bounded variation in the Radon domain. This space of functions was recently proposed as the natural function space associated with shallow ReLU neural networks. We derive a minimax lower bound for the estimation problem over this function space and show that the neural network estimators are minimax optimal up to logarithmic factors. We also show that this space is a "mixed variation" function space that contains classical multivariate function spaces, including certain Sobolev spaces and certain spectral Barron spaces. Finally, we use these results to quantify a gap between neural networks and linear methods (which include kernel methods). This paper sheds light on the phenomenon that neural networks seem to break the curse of dimensionality.
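As a concrete sketch of the objective the abstract describes (the notation below is our own, not taken from the paper): writing a width-$K$ shallow ReLU network as $f_\theta(x) = \sum_{k=1}^{K} v_k \max\{0,\, w_k^\top x - b_k\} + c$, training with weight decay corresponds to the estimator

$$\hat{\theta} \in \arg\min_{\theta} \; \sum_{i=1}^{n} \big(y_i - f_\theta(x_i)\big)^2 \;+\; \lambda \sum_{k=1}^{K} \big(|v_k|^2 + \|w_k\|_2^2\big),$$

where $\lambda > 0$ is the weight-decay parameter. Since rescaling a neuron via $v_k \mapsto \alpha v_k$, $(w_k, b_k) \mapsto (w_k, b_k)/\alpha$ leaves $f_\theta$ unchanged, minimizing over such rescalings replaces the penalty with $2\lambda \sum_{k} |v_k|\,\|w_k\|_2$; this rescaling-invariant form is what connects weight decay to the second-order bounded-variation seminorm in the Radon domain.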