Interpreting the training of Deep Neural Networks (DNNs) as an optimal control problem over nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we make an attempt along this line by reformulating the training procedure from a trajectory-optimization perspective. We first show that most widely used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that accepts batch optimization for training feedforward networks while integrating naturally with recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies, which improve the convergence rate and reduce sensitivity to hyper-parameters compared with existing methods. We show that the algorithm is competitive against state-of-the-art first- and second-order methods. Our work opens up new avenues for principled algorithmic design built on optimal control theory.
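To make the idea of layer-wise feedback policies concrete, the sketch below is our own minimal illustration, not the paper's released code. It treats a feedforward network with linear layers, x_{t+1} = W_t x_t, as a discrete-time dynamical system whose controls are the layer weights, runs a DDP-style backward pass on a quadratic regression loss for a single sample, and then applies the resulting policy du_t = k_t + K_t dx_t in a forward pass. All names, dimensions, and the linear-layer and single-sample simplifications are assumptions made for brevity; second-order dynamics terms are dropped (an iLQR-style Gauss-Newton simplification), and a fixed step size stands in for a proper line search.

```python
import numpy as np

# Illustrative sketch only: a feedforward net read as the dynamics
# x_{t+1} = W_t x_t, with the weights W_t playing the role of controls u_t.
# The DDP backward pass yields per-layer feedback policies
# du_t = k_t + K_t @ dx_t, rather than the open-loop gradient of backprop.

rng = np.random.default_rng(0)
T, n = 3, 2                                # number of layers, layer width
Ws = [rng.normal(size=(n, n)) for _ in range(T)]
x0 = rng.normal(size=n)                    # one training input
y = rng.normal(size=n)                     # its regression target
reg = 1e-3                                 # Levenberg-Marquardt damping on Q_uu
alpha = 0.5                                # step size (crude line-search stand-in)

def forward(Ws, x0):
    xs = [x0]
    for W in Ws:
        xs.append(W @ xs[-1])
    return xs

# --- backward pass: terminal loss phi(x_T) = 0.5 * ||x_T - y||^2 ---
xs = forward(Ws, x0)
Vx, Vxx = xs[-1] - y, np.eye(n)            # gradient and Hessian of phi
gains = [None] * T
for t in reversed(range(T)):
    x = xs[t]
    fx = Ws[t]                             # d(Wx)/dx
    fu = np.kron(x[None, :], np.eye(n))    # d(Wx)/dvec(W), column-major vec
    Qx, Qu = fx.T @ Vx, fu.T @ Vx
    Qxx = fx.T @ Vxx @ fx
    Quu = fu.T @ Vxx @ fu + reg * np.eye(n * n)
    Qux = fu.T @ Vxx @ fx
    k = -np.linalg.solve(Quu, Qu)          # open-loop (gradient-like) term
    K = -np.linalg.solve(Quu, Qux)         # layer-wise feedback gain
    gains[t] = (k, K)
    Vx = Qx + K.T @ Quu @ k + K.T @ Qu + Qux.T @ k
    Vxx = Qxx + K.T @ Quu @ K + K.T @ Qux + Qux.T @ K

# --- forward pass: apply the feedback policy while re-propagating states ---
x_new = x0
for t, (k, K) in enumerate(gains):
    dx = x_new - xs[t]                     # deviation from the nominal state
    dW = (alpha * k + K @ dx).reshape(n, n, order="F")  # undo column-major vec
    Ws[t] = Ws[t] + dW
    x_new = Ws[t] @ x_new

print("loss before:", 0.5 * np.sum((xs[-1] - y) ** 2))
print("loss after: ", 0.5 * np.sum((forward(Ws, x0)[-1] - y) ** 2))
```

The feedback gain K_t is what distinguishes this update from plain backprop: when the forward pass visits a state that deviates from the nominal trajectory, each layer's weight update is corrected in proportion to that deviation instead of replaying a fixed gradient step.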