In this paper we analyze the $L_2$ error of neural network regression estimates with one hidden layer. Under the assumption that the Fourier transform of the regression function decays suitably fast, we show that an estimate, where all initial weights are chosen according to proper uniform distributions and where the weights are learned by gradient descent, achieves a rate of convergence of $1/\sqrt{n}$ (up to a logarithmic factor). Our statistical analysis implies that the key aspect behind this result is the proper choice of the initial inner weights and the adjustment of the outer weights via gradient descent. This indicates that we can also simply use linear least squares to choose the outer weights. We prove a corresponding theoretical result and compare our new linear least squares neural network estimate with standard neural network estimates on simulated data. Our simulations show that our theoretical considerations lead to an estimate with improved performance. Hence the development of statistical theory can indeed improve neural network estimates.
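To make the construction concrete, the following is a minimal sketch of such a linear least squares neural network estimate: the inner weights of a one-hidden-layer network are drawn from a uniform distribution and kept fixed, and only the outer weights are fitted by linear least squares. The sigmoid activation, the number of hidden neurons `K`, the initialization range `c`, and the simulated data in the usage example are illustrative assumptions, not parameter choices taken from the paper.

```python
import numpy as np

def fit_least_squares_network(X, y, K=50, c=10.0, seed=0):
    """One-hidden-layer regression estimate:
    inner weights drawn uniformly at random and kept fixed,
    outer weights fitted by linear least squares.
    K and c are hypothetical choices for illustration only."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Inner weights and biases: uniform on [-c, c].
    W = rng.uniform(-c, c, size=(K, d))
    b = rng.uniform(-c, c, size=K)
    # Hidden-layer features with a sigmoid squasher, plus an outer bias column.
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))   # shape (n, K)
    H = np.hstack([H, np.ones((n, 1))])
    # Outer weights by ordinary linear least squares.
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)

    def predict(X_new):
        H_new = 1.0 / (1.0 + np.exp(-(X_new @ W.T + b)))
        H_new = np.hstack([H_new, np.ones((X_new.shape[0], 1))])
        return H_new @ beta

    return predict

# Usage on simulated regression data (synthetic example).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(500)
predict = fit_least_squares_network(X, y)
print(predict(X[:5]))
```

Because the hidden-layer features are fixed once the inner weights are drawn, the outer weights solve an ordinary linear least squares problem, which is the computational simplification suggested by the statistical analysis above.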