We investigate what can be learned from translating numerical algorithms into neural networks. On the numerical side, we consider explicit, accelerated explicit, and implicit schemes for a general higher order nonlinear diffusion equation in 1D, as well as linear multigrid methods. On the neural network side, we identify corresponding concepts in terms of residual networks (ResNets), recurrent networks, and U-nets. These connections guarantee Euclidean stability of specific ResNets with a transposed convolution layer structure in each block. We present three numerical justifications for skip connections: as time discretisations in explicit schemes, as extrapolation mechanisms for accelerating those methods, and as recurrent connections in fixed point solvers for implicit schemes. Last but not least, we also motivate uncommon design choices such as nonmonotone activation functions. Our findings give a numerical perspective on the success of modern neural network architectures, and they provide design criteria for stable networks.
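To make the first of these correspondences concrete, the sketch below writes one explicit step of a 1D nonlinear diffusion scheme in ResNet form: the identity term acts as the skip connection, a forward difference plays the role of the inner convolution, its transpose gives the transposed convolution layer of the block, and the flux function serves as a nonmonotone activation. This is a minimal illustration under assumed names (resnet_diffusion_block, phi) and parameter choices, not code from the paper.

```python
import numpy as np

def resnet_diffusion_block(u, tau, phi):
    """One explicit step of 1D nonlinear diffusion u_t = (g(u_x) u_x)_x,
    written in ResNet form  u_new = u - tau * K^T phi(K u),
    where K is a forward-difference 'convolution', K^T its transpose
    (homogeneous Neumann boundaries), and phi the flux function acting
    as activation. The identity term u is the skip connection.
    Illustrative sketch only; names and step size are assumptions."""
    Ku = np.diff(u)                                    # inner convolution: forward differences
    act = phi(Ku)                                      # activation = flux function
    KTact = np.concatenate(([-act[0]], -np.diff(act), [act[-1]]))  # transposed convolution
    return u - tau * KTact                             # skip connection + residual correction

# Example: Perona-Malik-type flux as a nonmonotone 'activation'
phi = lambda s: s / (1.0 + s**2)
u = np.linspace(0.0, 1.0, 11) + 0.05 * np.random.randn(11)
u = resnet_diffusion_block(u, tau=0.2, phi=phi)
```

Iterating such blocks corresponds to marching the explicit scheme forward in time; the step size tau must then respect the usual explicit stability bound, which is exactly the kind of criterion the stability results in the paper address.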