Rethinking ResNets: Improved Stacking Strategies With High Order Schemes

Deep neural network (DNN) architectures hold many important records in computer vision. Although these architectures draw attention worldwide, the design of their overall structure lacks general guidance. Building on the relationship between DNN design and the numerical solution of differential equations, we perform a fair comparison of residual designs from a higher-order perspective. We show that the widely used strategy of repeatedly stacking a small block (usually 2-3 layers) can be easily improved, with solid theoretical support and no extra parameters. We reorganise the residual design into higher-order schemes, inspired by the observation that many effective networks can be interpreted as different numerical discretisations of differential equations. A ResNet block follows a relatively simple scheme, forward Euler; however, the situation quickly becomes complicated once blocks are stacked. We hypothesise that if a stacked ResNet is in some sense equivalent to a higher-order scheme, then its current forward propagation may be relatively weak compared with a typical high-order method such as Runge-Kutta. We propose HO-ResNet to test this hypothesis on widely used CV benchmarks with extensive experiments. Stable and noticeable increases in performance are observed, and convergence and robustness are also improved. Our stacking strategy improved ResNet-30 by 2.15 per cent and ResNet-58 by 2.35 per cent on CIFAR-10, with the same settings and parameters. The proposed strategy is fundamental and theoretical and can therefore be applied to any network as a general guideline.
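The contrast between plain stacking and a higher-order scheme can be illustrated with a minimal numerical sketch. It is not the HO-ResNet implementation: the function names `euler_stack` and `rk4_block` are hypothetical, and the residual branch is replaced by the toy scalar function f(x) = -x (an assumption chosen because the exact solution exp(-t) is known). In a real network each evaluation of f would be a separate learned block; here a single f is shared only to keep the comparison simple. Both variants evaluate f four times, so the cost is comparable.

```python
import math


def euler_stack(f, x, h, n_blocks):
    """Plain residual stacking: every block computes x + h * f(x) (forward Euler)."""
    for _ in range(n_blocks):
        x = x + h * f(x)
    return x


def rk4_block(f, x, h):
    """One classical fourth-order Runge-Kutta step, which also evaluates f four times."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def f(x):
    # Toy stand-in for a residual branch: dx/dt = -x, exact solution exp(-t).
    return -x


x0, t_end = 1.0, 1.0
exact = math.exp(-t_end)

euler = euler_stack(f, x0, t_end / 4, 4)  # four Euler blocks, four evaluations of f
rk4 = rk4_block(f, x0, t_end)             # one RK4 block, four evaluations of f

euler_err = abs(euler - exact)
rk4_err = abs(rk4 - exact)
print(f"Euler x4: {euler:.6f}  error {euler_err:.6f}")
print(f"RK4 x1:   {rk4:.6f}  error {rk4_err:.6f}")
```

On this toy problem the single Runge-Kutta step lands closer to the exact value than four stacked Euler steps with the same number of evaluations of f. That is the intuition behind reorganising stacked residual blocks; whether it carries over to trained networks is what the experiments in the abstract address.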
