Neural networks have gained much interest because of their effectiveness in many applications. However, their mathematical properties are generally not well understood. If there is some geometric structure inherent to the data or to the function to be approximated, it is often desirable to take this into account in the design of the neural network. In this work, we start with a non-autonomous ODE and build neural networks using a suitable, structure-preserving, numerical time-discretisation. The structure of the neural network is then inferred from the properties of the ODE vector field. Besides injecting more structure into the network architectures, this modelling procedure allows a better theoretical understanding of their behaviour. We present two universal approximation results and demonstrate how to impose particular properties on the neural networks. A particular focus is on 1-Lipschitz architectures including layers that are not 1-Lipschitz. These networks are expressive and robust against adversarial attacks, as shown for the CIFAR-10 dataset.
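To illustrate the general idea, a minimal sketch of the ODE-to-network construction, assuming the simplest case: a forward-Euler discretisation of the non-autonomous ODE x'(t) = f(t, x(t)), which yields residual layers of the form x_{k+1} = x_k + h f(t_k, x_k). The function names, tanh vector field, and per-step weights below are illustrative assumptions, not the paper's actual architecture or discretisation.

```python
import numpy as np

def euler_layer(x, W, b, h):
    """One forward-Euler step: x + h * sigma(W x + b).
    For small h this is a small perturbation of the identity;
    its Lipschitz constant is bounded by 1 + h * ||W||."""
    return x + h * np.tanh(W @ x + b)

def euler_net(x, weights, biases, h):
    """Compose Euler steps. The time-dependence of the
    non-autonomous vector field is modelled by letting the
    weights (W_k, b_k) vary from step to step."""
    for W, b in zip(weights, biases):
        x = euler_layer(x, W, b, h)
    return x

# Illustrative random weights (hypothetical, for demonstration only).
rng = np.random.default_rng(0)
d, n_layers, h = 4, 3, 0.1
Ws = [rng.standard_normal((d, d)) for _ in range(n_layers)]
bs = [rng.standard_normal(d) for _ in range(n_layers)]
x0 = rng.standard_normal(d)
out = euler_net(x0, Ws, bs, h)
print(out.shape)  # (4,)
```

With h = 0 each layer reduces to the identity, which makes explicit how the network is a perturbation of the identity map controlled by the step size; structure-preserving discretisations and Lipschitz constraints on the vector field then translate into properties of the resulting layers.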