We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of the classical momentum accelerated gradient descent, to improve neural ODEs (NODEs) training and inference. HBNODEs have two properties that imply practical advantages over NODEs: (i) The adjoint state of an HBNODE also satisfies an HBNODE, accelerating both forward and backward ODE solvers, thus significantly reducing the number of function evaluations (NFEs) and improving the utility of the trained models. (ii) The spectrum of HBNODEs is well structured, enabling effective learning of long-term dependencies from complex sequential data. We verify the advantages of HBNODEs over NODEs on benchmark tasks, including image classification, learning complex dynamics, and sequential modeling. Our method requires remarkably fewer forward and backward NFEs, is more accurate, and learns long-term dependencies more effectively than the other ODE-based neural network models. Code is available at \url{https://github.com/hedixia/HeavyBallNODE}.
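To make the idea concrete, the classical heavy ball method in the continuous limit follows the second-order ODE $x''(t) + \gamma x'(t) = -\nabla f(x(t))$; HBNODEs build on this form with a learned vector field. The sketch below (not the authors' implementation) integrates the heavy ball ODE as a first-order system on an illustrative quadratic objective; the damping `gamma`, step size `dt`, and matrix `A` are arbitrary demo choices.

```python
import numpy as np

def heavy_ball_flow(grad_f, x0, gamma=1.0, dt=0.01, steps=2000):
    """Forward-Euler integration of x'' + gamma * x' = -grad_f(x),
    rewritten as the first-order system (x' = v, v' = -gamma*v - grad_f(x))."""
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)              # momentum (velocity) state
    for _ in range(steps):
        a = -gamma * v - grad_f(x)    # acceleration from the heavy ball ODE
        x = x + dt * v
        v = v + dt * a
    return x

# Illustrative quadratic f(x) = 0.5 * x^T A x with minimizer at the origin
A = np.diag([1.0, 10.0])              # ill-conditioned example
grad = lambda x: A @ x                # gradient of f
x_star = heavy_ball_flow(grad, [5.0, 5.0])
print(np.linalg.norm(x_star))        # trajectory settles near the minimizer
```

In an HBNODE the fixed gradient field $-\nabla f$ is replaced by a trainable network, and the same augmented $(x, v)$ system structure is what yields the adjoint and spectral properties claimed in the abstract.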