Neural Ordinary Differential Equations (NODEs), a framework of continuous-depth neural networks, have been widely applied, showing exceptional efficacy on representative datasets. Recently, an augmented framework was developed to overcome some limitations that emerged in applying the original framework. In this paper, we propose a new class of continuous-depth neural networks with delay, named Neural Delay Differential Equations (NDDEs). To compute the corresponding gradients, we use the adjoint sensitivity method to obtain the delayed dynamics of the adjoint. Delay differential equations are typically regarded as infinite-dimensional dynamical systems that possess richer dynamics. Compared with NODEs, NDDEs have a stronger capacity for nonlinear representation. We use several illustrative examples to demonstrate this outstanding capacity. First, in both model-free and model-based settings, we successfully model delayed dynamics whose trajectories in the lower-dimensional phase space can intersect one another and even become chaotic. Traditional NODEs, without any augmentation, are not directly applicable to such modeling. Second, we achieve lower loss and higher accuracy not only on data produced synthetically by complex models but also on CIFAR-10, a well-known image dataset. Our results on NDDEs demonstrate that appropriately incorporating elements of dynamical systems into network design is genuinely beneficial for network performance.
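To make the notion of "continuous depth with delay" concrete, the sketch below integrates an NDDE of the hedged form dh/dt = f_theta(h(t), h(t - tau)) with a constant initial history, using fixed-step Euler and a history buffer (method of steps). The vector field f_theta, the delay tau, and the toy two-layer network are illustrative assumptions for exposition, not the authors' implementation or training procedure (which uses the adjoint sensitivity method).

```python
import numpy as np

rng = np.random.default_rng(0)
dim, hidden = 4, 16
# Toy parameters of f_theta, which acts on the concatenation [h(t); h(t - tau)].
W1 = rng.standard_normal((hidden, 2 * dim)) * 0.1
W2 = rng.standard_normal((dim, hidden)) * 0.1

def f_theta(h_now, h_delayed):
    """Vector field of the NDDE: depends on the current and the delayed state."""
    z = np.concatenate([h_now, h_delayed])
    return W2 @ np.tanh(W1 @ z)

def ndde_forward(h0, tau=1.0, T=5.0, dt=0.01):
    """Integrate h from 0 to T with constant initial history h(t) = h0 for t <= 0."""
    n_steps = int(T / dt)
    delay_steps = int(tau / dt)
    traj = [h0.copy()]
    for k in range(n_steps):
        # Delayed state: read it back from the stored trajectory, or fall back
        # to the initial history if t - tau is still before time zero.
        h_delayed = traj[k - delay_steps] if k >= delay_steps else h0
        traj.append(traj[k] + dt * f_theta(traj[k], h_delayed))
    return np.array(traj)

h0 = rng.standard_normal(dim)
states = ndde_forward(h0)
print(states.shape)  # (n_steps + 1, dim): the continuous-depth trajectory of the state
```

Because the vector field reads the state at an earlier time, the mapping from h(0) to h(T) is not constrained to be a flow in the finite-dimensional state space alone, which is the mechanism behind the abstract's claim that NDDEs can represent trajectories that intersect in the lower-dimensional phase space.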