We consider a sparse deep ReLU network (SDRN) estimator obtained from empirical risk minimization with a Lipschitz loss function in the presence of a large number of features. Our framework applies to a variety of regression and classification problems. The unknown target function is assumed to lie in a Korobov space. Functions in this space need only satisfy a smoothness condition; they are not required to have a compositional structure. We develop non-asymptotic excess risk bounds for the SDRN estimator. We further show that, when the dimension of the features is fixed, the SDRN estimator achieves the same minimax rate of estimation (up to logarithmic factors) as one-dimensional nonparametric regression, and that it has a suboptimal rate when the dimension grows with the sample size. We show that the depth of the ReLU network and its total number of nodes and weights must grow with the sample size to ensure good performance, and we investigate how fast they should grow. These results provide theoretical guidance and a basis for empirical studies using deep neural networks.
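As a rough illustration of the quantity being minimized, the sketch below evaluates the empirical risk of a small ReLU network under a Lipschitz loss and counts its nonzero parameters. It is a minimal sketch under stated assumptions, not the estimator analyzed here: the Huber loss is used only as one familiar example of a Lipschitz loss, and the function names (`forward`, `huber`, `empirical_risk`, `count_nonzero`), the toy weights, and the data are all hypothetical.

```python
def relu(v):
    """Apply the ReLU activation elementwise to a list of floats."""
    return [max(a, 0.0) for a in v]


def affine(v, W, b):
    """Compute W v + b, where W is a list of rows (one row per output unit)."""
    return [sum(w * a for w, a in zip(row, v)) + b_j for row, b_j in zip(W, b)]


def forward(x, layers):
    """Evaluate a ReLU network with a linear, scalar-valued output layer.

    `layers` is a list of (W, b) pairs; ReLU follows every layer but the last.
    """
    h = x
    for W, b in layers[:-1]:
        h = relu(affine(h, W, b))
    W, b = layers[-1]
    return affine(h, W, b)[0]


def huber(pred, y, delta=1.0):
    """Huber loss, which is Lipschitz in `pred` with constant `delta`.

    Assumption: chosen for illustration only.
    """
    r = abs(pred - y)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)


def empirical_risk(xs, ys, layers, loss=huber):
    """Average loss of the network over the sample (the ERM objective)."""
    return sum(loss(forward(x, layers), y) for x, y in zip(xs, ys)) / len(xs)


def count_nonzero(layers):
    """Number of nonzero weights and biases, a simple measure of sparsity."""
    n_weights = sum(1 for W, _ in layers for row in W for w in row if w != 0)
    n_biases = sum(1 for _, b in layers for b_j in b if b_j != 0)
    return n_weights + n_biases


# Hypothetical toy network: 2 inputs -> 2 hidden ReLU units -> 1 linear output.
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
xs = [[1.0, -1.0], [2.0, 3.0]]
ys = [3.0, 4.5]

risk = empirical_risk(xs, ys, layers)
nonzero = count_nonzero(layers)
```

An estimator of this kind would minimize `empirical_risk` over networks whose depth and number of nonzero parameters are constrained. The sketch only evaluates the objective for fixed weights and performs no optimization.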