This paper presents a mathematical analysis of ODE-Net, a continuum model of deep neural networks (DNNs). In recent years, machine-learning researchers have proposed replacing the deep structure of DNNs with an ODE as a continuum limit. These studies regard the "learning" of ODE-Net as the minimization of a "loss" constrained by a parametric ODE. Although these studies must assume that a minimizer of this problem exists, few have investigated its existence analytically. In the present paper, the existence of a minimizer is discussed based on a formulation of ODE-Net as a measure-theoretic mean-field optimal control problem. First, existence is proved when the neural network describing the vector field of ODE-Net is linear in its learnable parameters; the proof combines the measure-theoretic formulation with the direct method of the calculus of variations. Second, an idealized minimization problem is proposed to remove this linearity assumption; it is inspired by the kinetic regularization associated with the Benamou--Brenier formula and by universal approximation theorems for neural networks. The proofs of these existence results use variational methods, differential equations, and mean-field optimal control theory. They offer a new analytic approach to investigating the learning process of deep neural networks.
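The "loss minimization constrained by a parametric ODE" viewpoint can be illustrated with a toy discretization. The following is a minimal sketch, not the paper's method: the vector field is taken linear in the parameters (matching the first existence result), the ODE is discretized by forward Euler, and the loss is minimized by finite-difference gradient descent. All names, the feature map `sigma`, and the toy data are assumptions for illustration only.

```python
import numpy as np

# Hypothetical sketch of ODE-Net learning: minimize a terminal loss over
# time-dependent parameters theta(t) of the ODE  dx/dt = f(x, theta(t)),
# with f LINEAR in theta:  f(x, theta) = theta @ sigma(x)  (assumed form).

rng = np.random.default_rng(0)
d = 2              # state dimension
T, K = 1.0, 20     # time horizon and number of Euler steps
dt = T / K

def sigma(x):
    # Assumed bounded feature map (tanh), applied componentwise.
    return np.tanh(x)

def forward(x0, thetas):
    """Forward Euler discretization of the ODE-Net flow."""
    x = x0
    for k in range(K):
        x = x + dt * thetas[k] @ sigma(x)   # linear in theta_k
    return x

def loss(thetas, x0s, ys, lam=1e-2):
    """Terminal loss plus a small L2 regularizer on the control."""
    pred = np.array([forward(x0, thetas) for x0 in x0s])
    reg = lam * dt * sum(np.sum(th ** 2) for th in thetas)
    return np.mean(np.sum((pred - ys) ** 2, axis=1)) + reg

# Toy data: learn the contraction x -> 0.5 * x.
x0s = rng.normal(size=(16, d))
ys = 0.5 * x0s

thetas = [np.zeros((d, d)) for _ in range(K)]
loss0 = loss(thetas, x0s, ys)

# Finite-difference gradient descent (illustration only, not efficient).
eps, lr = 1e-5, 0.3
for _ in range(150):
    base = loss(thetas, x0s, ys)
    grads = [np.zeros_like(th) for th in thetas]
    for k in range(K):
        for i in range(d):
            for j in range(d):
                thetas[k][i, j] += eps
                grads[k][i, j] = (loss(thetas, x0s, ys) - base) / eps
                thetas[k][i, j] -= eps
    for k in range(K):
        thetas[k] -= lr * grads[k]

print(loss0, loss(thetas, x0s, ys))  # loss before and after training
```

As the number of Euler steps K grows, such discrete networks formally approach the continuum ODE-Net whose minimizers the paper studies.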
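The mean-field optimal control formulation and its kinetically regularized idealization can be written schematically as follows. This is a hedged sketch with assumed notation ($\ell$ a loss, $\mu_t$ the distribution of states at time $t$, $f$ the network vector field, $v$ a transport velocity), not the paper's exact statement:

```latex
% Mean-field formulation: minimize a terminal loss over curves of
% probability measures driven by the parametric vector field.
\min_{\theta(\cdot)} \; \int \ell(x)\,\mathrm{d}\mu_T(x)
\quad \text{s.t.} \quad
\partial_t \mu_t + \nabla \cdot \bigl( f(\cdot, \theta(t))\, \mu_t \bigr) = 0,
\qquad \mu_0 = \mu_{\mathrm{data}}.

% Idealized problem in the spirit of Benamou--Brenier: optimize over a
% free velocity field and penalize its kinetic energy.
\min_{v} \; \int \ell(x)\,\mathrm{d}\mu_T(x)
+ \lambda \int_0^T \!\! \int |v(x,t)|^2 \,\mathrm{d}\mu_t(x)\,\mathrm{d}t
\quad \text{s.t.} \quad
\partial_t \mu_t + \nabla \cdot ( v\, \mu_t ) = 0.
```

Universal approximation theorems suggest that such free velocity fields can be approximated by neural networks, which motivates removing the linearity assumption through this idealization.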