Robust Nonparametric Regression with Deep Neural Networks

From arXiv. Guohao Shen and Yuling Jiao contributed equally to this work. Corresponding authors: Yuanyuan Lin (Email: ylin@sta.cuhk.edu.hk) and Jian Huang (Email: jian-huang@uiowa.edu). arXiv admin note: substantial text overlap with arXiv:2104.06708.

In this paper, we study the properties of robust nonparametric estimation using deep neural networks for regression models with heavy-tailed error distributions. We establish non-asymptotic error bounds for a class of robust nonparametric regression estimators using deep neural networks with ReLU activation, under suitable smoothness conditions on the regression function and mild conditions on the error term. In particular, we only assume that the error distribution has a finite p-th moment with p greater than one. We also show that the deep robust regression estimators are able to circumvent the curse of dimensionality when the distribution of the predictor is supported on an approximate lower-dimensional set. This assumption relaxes the exact manifold support assumption, which could be restrictive and unrealistic in practice. An important feature of our error bound is that, for ReLU neural networks with network width and network size (number of parameters) no more than the order of the square of the dimensionality d of the predictor, our excess risk bounds depend sub-linearly on d. We also relax several crucial assumptions on the data distribution, the target regression function and the neural networks required in the recent literature. Our simulation studies demonstrate that the robust methods can significantly outperform the least squares method when the errors have heavy-tailed distributions, and illustrate that the choice of loss function is important in the context of deep nonparametric regression.
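To make the remark about the choice of loss function concrete, here is a minimal sketch in Python. It is not the paper's estimator: the abstract does not name the robust losses used, so the Huber and least absolute deviation (LAD) losses are assumed here as standard examples, and a constant predictor stands in for the ReLU network. The function names, the threshold `delta`, and the toy data are all hypothetical.

```python
import statistics


def squared_loss(r):
    """Least squares loss: grows quadratically in the residual r."""
    return r * r


def absolute_loss(r):
    """LAD loss: grows linearly in the residual r."""
    return abs(r)


def huber_loss(r, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it.

    delta is an assumed tuning parameter, not a value from the paper.
    """
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


def empirical_risk(loss, theta, ys):
    """Average loss of the constant predictor theta on the responses ys."""
    return sum(loss(y - theta) for y in ys) / len(ys)


# Toy responses (made up for illustration): four values near 1, one gross outlier.
ys = [0.8, 0.9, 1.0, 1.1, 50.0]

# Over constant predictors, squared loss is minimized by the sample mean
# and absolute loss by the sample median.
ls_fit = statistics.mean(ys)
lad_fit = statistics.median(ys)

print("least squares fit:", ls_fit)
print("LAD fit:", lad_fit)
print("Huber risk at LAD fit:", empirical_risk(huber_loss, lad_fit, ys))
print("Huber risk at least squares fit:", empirical_risk(huber_loss, ls_fit, ys))
```

On this toy sample the single outlier pulls the least squares fit to about 10.76, while the LAD fit stays at 1.0. Losses that grow linearly in large residuals limit the influence of extreme errors, which is the intuition behind using robust losses when the errors are heavy-tailed.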
