The theoretical structure of deep neural networks (DNNs) has been clarified gradually. Imaizumi-Fukumizu (2019) and Suzuki (2019) showed that the learning ability of DNNs surpasses previous theoretical guarantees when the target functions are non-smooth. However, as far as the author is aware, none of the numerous works to date has attempted to investigate mathematically which DNN architectures actually induce pointwise convergence of gradient descent (without any statistical argument), even though such an attempt would appear to be closer to practical DNNs. In this paper we restrict the target functions to non-smooth indicator functions and construct a ReLU-DNN for which the mini-batch gradient descent process induces pointwise convergence.
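To make the setting concrete, the following is a minimal illustrative sketch, not the paper's construction: a one-hidden-layer ReLU network is trained by mini-batch gradient descent on the squared loss to fit the indicator of [0, 1/2] on [0, 1]. The width, learning rate, step count, and this particular indicator target are assumptions chosen for illustration only.

```python
# Illustrative sketch only (not the paper's construction): fit the indicator
# of [0, 1/2] on [0, 1] with a one-hidden-layer ReLU network trained by
# mini-batch gradient descent on the loss 0.5 * mean((pred - y)^2).
import numpy as np

rng = np.random.default_rng(0)

def indicator(x):
    # Non-smooth target: 1 on [0, 1/2], 0 elsewhere.
    return (x <= 0.5).astype(float)

# Network: f(x) = w2 . relu(w1 * x + b1) + b2  (width is an assumed choice)
width = 32
w1 = rng.normal(scale=1.0, size=width)
b1 = rng.normal(scale=1.0, size=width)
w2 = rng.normal(scale=1.0 / np.sqrt(width), size=width)
b2 = 0.0

def forward(x):
    # x: (batch,) -> pre-activations z, hidden activations h, outputs
    z = np.outer(x, w1) + b1
    h = np.maximum(z, 0.0)          # ReLU
    return z, h, h @ w2 + b2

lr, batch, steps = 0.1, 64, 20000   # assumed hyperparameters
for t in range(steps):
    x = rng.uniform(0.0, 1.0, size=batch)   # mini-batch of inputs
    y = indicator(x)
    z, h, pred = forward(x)
    err = pred - y                           # d(loss)/d(pred) per sample
    # Gradients of the averaged squared loss
    gw2 = h.T @ err / batch
    gb2 = err.mean()
    dh = np.outer(err, w2) * (z > 0)         # backprop through ReLU
    gw1 = (dh * x[:, None]).mean(axis=0)
    gb1 = dh.mean(axis=0)
    # Mini-batch gradient descent update
    w1 -= lr * gw1; b1 -= lr * gb1
    w2 -= lr * gw2; b2 -= lr * gb2

# Inspect pointwise approximation quality away from the jump at x = 1/2.
xs = np.array([0.1, 0.3, 0.49, 0.51, 0.7, 0.9])
print(np.round(forward(xs)[2], 3), indicator(xs))
```

In this toy run the network output approaches the indicator's values at fixed points away from the discontinuity, which is the kind of pointwise behaviour the abstract refers to; the paper's actual architecture and convergence argument are developed in the body of the text.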