Implicit-layer deep learning techniques, such as Neural Differential Equations, have become an important modeling framework due to their ability to adapt to new problems automatically. Training a neural differential equation is effectively a search over a space of plausible dynamical systems. However, controlling the computational cost of these models is difficult, since it depends on the number of steps the adaptive solver takes. Most prior work either uses higher-order methods to reduce prediction time at the cost of greatly increased training time, or reduces both training and prediction time by relying on specific training algorithms that are harder to use as drop-in replacements because of strict requirements on automatic differentiation. In this manuscript, we use the internal cost heuristics of adaptive differential equation solvers at stochastic time points to guide training toward learning a dynamical system that is easier to integrate. We "close the black box" and allow our method to be used with any adjoint technique for computing gradients of the differential equation solution. In experimental studies comparing our method to global regularization, we show that we attain similar performance without compromising implementation flexibility on ordinary differential equations (ODEs) and stochastic differential equations (SDEs). We develop two sampling strategies to trade off between performance and training time. Our method reduces the number of function evaluations to 0.556x-0.733x of the baseline and accelerates predictions by 1.3x-2x.
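To make the core idea concrete, below is a minimal sketch of regularizing a neural ODE with a solver's internal local-error estimate at a stochastic time point. It uses a hand-rolled embedded Heun(2)/Euler(1) pair so the error estimate is visible, rather than the adaptive solvers and adjoint machinery used in the paper; `VectorField`, `lambda_reg`, and the toy regression target are illustrative assumptions, not the paper's actual API.

```python
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """Neural ODE right-hand side f_theta(t, u)."""
    def __init__(self, dim=2, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t, u):
        return self.net(torch.cat([u, t.expand(u.shape[:-1] + (1,))], dim=-1))

def heun_step(f, t, u, h):
    """One embedded Heun(2)/Euler(1) step: returns the 2nd-order update and the
    local error estimate (difference between the 2nd- and 1st-order solutions)."""
    k1 = f(t, u)
    k2 = f(t + h, u + h * k1)
    u_next = u + 0.5 * h * (k1 + k2)   # Heun (2nd order) update
    err = 0.5 * h * (k2 - k1)          # u_next minus the Euler step: local error estimate
    return u_next, err

def solve_with_local_reg(f, u0, t0=0.0, t1=1.0, n_steps=40):
    """Fixed-grid integration; the regularizer is the error estimate at ONE
    uniformly sampled step, mimicking regularization at a stochastic time point."""
    h = (t1 - t0) / n_steps
    reg_step = torch.randint(n_steps, (1,)).item()
    u, reg = u0, torch.zeros((), dtype=u0.dtype)
    for i in range(n_steps):
        t = torch.tensor(t0 + i * h)
        u, err = heun_step(f, t, u, h)
        if i == reg_step:
            reg = err.pow(2).sum()
    return u, reg

# Toy training loop: fit u(t1) to a target while penalizing the local solver error,
# nudging the learned dynamics toward being easier (cheaper) to integrate.
f = VectorField()
opt = torch.optim.Adam(f.parameters(), lr=1e-3)
u0, target = torch.randn(16, 2), torch.zeros(16, 2)
lambda_reg = 1e-2                      # illustrative regularization weight
for step in range(200):
    opt.zero_grad()
    u1, reg = solve_with_local_reg(f, u0)
    loss = (u1 - target).pow(2).mean() + lambda_reg * reg
    loss.backward()
    opt.step()
```

Because the penalty only touches the solver's local error estimate, the same pattern can in principle be combined with any gradient path through the solve, which is the flexibility the abstract refers to.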