We study the problem of estimating an unknown function from noisy data using shallow ReLU neural networks. The estimators we study minimize the sum of squared data-fitting errors plus a regularization term proportional to the squared Euclidean norm of the network weights. This minimization corresponds to the common approach of training a neural network with weight decay. We quantify the performance (mean-squared error) of these neural network estimators when the data-generating function belongs to the second-order Radon-domain bounded variation space. This space of functions was recently proposed as the natural function space associated with shallow ReLU neural networks. We derive a minimax lower bound for the estimation problem for this function space and show that the neural network estimators are minimax optimal up to logarithmic factors. This minimax rate is immune to the curse of dimensionality. We quantify an explicit gap between neural networks and linear methods (which include kernel methods) by deriving a linear minimax lower bound for the estimation problem, showing that linear methods necessarily suffer the curse of dimensionality in this function space. As a result, this paper sheds light on the phenomenon that neural networks seem to break the curse of dimensionality.
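For concreteness, the weight-decay objective described above can be sketched as follows; this is a minimal illustration, with the width $K$, regularization parameter $\lambda$, and weight symbols $v_k$, $w_k$, $b_k$ chosen for exposition rather than fixed by the abstract. A shallow ReLU network $f_\theta(x) = \sum_{k=1}^{K} v_k \, \mathrm{ReLU}(w_k^\top x - b_k) + c$ is fit to the data $\{(x_i, y_i)\}_{i=1}^{n}$ by solving
\[
\hat{\theta} \in \operatorname*{arg\,min}_{\theta} \; \sum_{i=1}^{n} \big(y_i - f_\theta(x_i)\big)^2 \;+\; \lambda \sum_{k=1}^{K} \big( |v_k|^2 + \|w_k\|_2^2 \big),
\]
where $\lambda > 0$ controls the strength of the weight decay and the second term is the squared Euclidean norm of the network weights (here assumed to penalize the input and output weights).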