We develop a minimax rate analysis to explain why deep neural networks (DNNs) outperform other standard methods. For nonparametric regression problems, it is well known that many standard methods attain the minimax-optimal rate of estimation error for smooth functions, so it is not straightforward to identify the theoretical advantages of DNNs. This study attempts to fill this gap by considering the estimation of a class of non-smooth functions that have singularities on hypersurfaces. Our findings are as follows: (i) We derive the generalization error of a DNN estimator and prove that its convergence rate is almost minimax optimal. (ii) We elucidate a phase diagram of estimation problems, which describes the situations in which DNNs outperform a general class of estimators, including kernel methods, Gaussian process methods, and others. We additionally show that DNNs outperform harmonic-analysis-based estimators. This advantage of DNNs stems from the fact that their multi-layer structure can successfully handle the shape of the singularities.
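To make the claimed phenomenon concrete, the following is a minimal numerical sketch (not the estimator construction or proof from the paper): it fits a deep ReLU network and an RBF kernel ridge regressor to a piecewise-smooth target whose singularity lies on a hypersurface (a circle in 2D). The target function, sample size, and all hyperparameters are illustrative assumptions chosen for this demonstration.

```python
# Minimal sketch (not the paper's estimator): compare a deep ReLU network
# with a kernel method on a piecewise-smooth target whose singularity lies
# on a hypersurface -- here, the circle x1^2 + x2^2 = 0.5 in 2D.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

def target(X):
    # Smooth on each side of the circle, with a jump discontinuity across it.
    inside = (X ** 2).sum(axis=1) < 0.5
    return np.where(inside, np.sin(3 * X[:, 0]), 2.0 + np.cos(3 * X[:, 1]))

n = 2000
X_train = rng.uniform(-1, 1, size=(n, 2))
y_train = target(X_train) + 0.1 * rng.standard_normal(n)  # noisy observations
X_test = rng.uniform(-1, 1, size=(n, 2))
y_test = target(X_test)

# Deep ReLU network: its multi-layer structure can localize the jump.
dnn = MLPRegressor(hidden_layer_sizes=(64, 64, 64), activation="relu",
                   max_iter=2000, random_state=0).fit(X_train, y_train)

# RBF kernel ridge regression: a standard smoothing-type estimator.
krr = KernelRidge(kernel="rbf", alpha=1e-3, gamma=10.0).fit(X_train, y_train)

print("DNN test MSE:", mean_squared_error(y_test, dnn.predict(X_test)))
print("KRR test MSE:", mean_squared_error(y_test, krr.predict(X_test)))
```

Under these assumed settings, the kernel estimator tends to oversmooth near the discontinuity, while the ReLU network can approximate the jump sharply; this is only an empirical illustration of the regime the phase diagram describes, not a substitute for the rate analysis.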