In this work, a novel model-based artificial neural network (ANN) training method is developed, supported by optimal control theory. The method augments the training labels in order to robustly guarantee convergence of the training loss and to improve the training convergence rate. Dynamic label augmentation is proposed within the framework of gradient descent training, where the convergence of the training loss is actively controlled. First, we capture the training behavior with the help of empirical Neural Tangent Kernels (NTK) and borrow tools from systems and control theory to analyze both the local and global training dynamics (e.g., stability, reachability). Second, we propose to dynamically alter the gradient descent training mechanism via fictitious labels as control inputs and an optimal state feedback policy. In this way, we enforce locally $\mathcal{H}_2$-optimal and convergent training behavior. The novel algorithm, \textit{Controlled Descent Training} (CDT), guarantees local convergence. CDT unlocks new potential in the analysis, interpretation, and design of ANN architectures. The applicability of the method is demonstrated on standard regression and classification problems.
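To make the mechanism concrete, the following is a minimal sketch of the idea under standard assumptions, not the paper's implementation; the network, data, weights $Q$, $R$, and step sizes are all illustrative. For the MSE loss $\frac{1}{2}\lVert f(x;\theta) - y\rVert^2$, gradient flow linearized through the empirical NTK $K$ yields the residual dynamics $\dot{e} = -\eta K e + \eta K u$ for the residual $e = f(x;\theta) - y$, once the labels are shifted by a fictitious control $u$. Treating $(A, B) = (-\eta K, \eta K)$ as a linear system, an $\mathcal{H}_2$/LQR state feedback $u = -R^{-1} B^\top P e$, with $P$ solving the associated Riccati equation, then shapes the convergence of the training loss.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical toy setup (not from the paper): a 1-hidden-layer tanh network
# on a small regression set, trained by discretized gradient flow on MSE loss.
rng = np.random.default_rng(0)
n, d, h = 8, 3, 32                       # samples, input dim, hidden width
X = rng.standard_normal((n, d))
y = np.sin(X @ rng.standard_normal(d))   # targets
W1 = rng.standard_normal((h, d)) / np.sqrt(d)
w2 = rng.standard_normal(h) / np.sqrt(h)

def forward(W1, w2):
    return np.tanh(X @ W1.T) @ w2

def jacobian(W1, w2):
    """Jacobian of the n network outputs w.r.t. all parameters, shape (n, p)."""
    H = np.tanh(X @ W1.T)                          # (n, h)
    dH = 1.0 - H**2                                # tanh'
    J_W1 = (dH * w2).reshape(n, h, 1) * X.reshape(n, 1, d)
    return np.hstack([J_W1.reshape(n, -1), H])     # [dW1 | dw2]

eta, dt = 1.0, 0.05                      # gradient-flow rate, Euler step
Q, R = np.eye(n), 0.1 * np.eye(n)        # assumed H2/LQR weighting matrices

for step in range(200):
    J = jacobian(W1, w2)
    K = J @ J.T                          # empirical NTK, (n, n)
    # Linearized residual dynamics: e_dot = -eta*K*e + eta*K*u.
    A, B = -eta * K, eta * K
    P = solve_continuous_are(A, B, Q, R) # Riccati solution for the feedback
    e = forward(W1, w2) - y              # output residual
    u = -np.linalg.solve(R, B.T @ P @ e) # optimal state feedback (label shift)
    # Gradient of 0.5*||f - (y + u)||^2 w.r.t. the parameters:
    grad = J.T @ (e - u)
    W1 -= dt * eta * grad[: h * d].reshape(h, d)
    w2 -= dt * eta * grad[h * d :]

print("final loss:", 0.5 * float(np.sum((forward(W1, w2) - y) ** 2)))
```

The feedback is recomputed at every step in this sketch because the empirical NTK drifts as the weights move; in a lazy-training regime where $K$ is nearly constant, the Riccati solution could be computed once and reused.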