Continuous-depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks as non-linear, vector-valued optimal control problems. The standard solution is the adjoint sensitivity method, which replicates a forward-backward-pass optimisation. We propose a new approach that makes the network's `depth' explicit as a fundamental variable, thereby reducing the problem to a system of forward-facing initial value problems. The method rests on the principle of `Invariant Imbedding', for which we prove a general solution applicable to all non-linear, vector-valued optimal control problems with both running and terminal losses. Our new architectures provide a tangible tool for inspecting the theoretical, and largely unexplained, properties of network depth. They also yield discrete implementations of Neural ODEs comparable to classes of imbedded residual neural networks. Through a series of experiments, we demonstrate the competitive performance of the proposed architectures on supervised learning and time series prediction.
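To make the correspondence between depth and the ODE's independent variable concrete, the following is a minimal sketch (not the paper's implementation) of a Neural-ODE-style block in which the hidden state evolves with depth t; the weights W, the vector field f, and the step count are illustrative assumptions. Discretising with forward Euler recovers an ordinary residual network, h_{k+1} = h_k + dt * f(h_k, t_k):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4)) * 0.1  # hypothetical weights

def f(h, t):
    """Depth-dependent vector field f(h, t) = tanh(W h) (illustrative choice)."""
    return np.tanh(W @ h)

def odeblock_forward(h0, t0=0.0, t1=1.0, n_steps=10):
    """Integrate h' = f(h, t) from depth t0 to t1 with forward Euler.

    Each Euler step h <- h + dt * f(h, t) is one residual layer, so
    n_steps plays the role of the network's depth.
    """
    h = h0
    dt = (t1 - t0) / n_steps
    for k in range(n_steps):
        h = h + dt * f(h, t0 + k * dt)
    return h

h1 = odeblock_forward(np.ones(4))
print(h1.shape)  # (4,)
```

Under the adjoint sensitivity method, gradients of a loss on h1 would be obtained by a second, backward-in-depth integration; the abstract's invariant-imbedding approach instead works entirely with forward initial value problems in this depth variable.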