Various Deep Neural Network (DNN) architectures have set remarkable records in computer vision. While they draw attention worldwide, the design of their overall structure still lacks general guidance. Building on the connection between DNN design and numerical ordinary differential equations, observed by several researchers in recent years, we perform a fair comparison of residual designs from a higher-order perspective. We show that the widely used DNN design strategy of repeatedly stacking a small block can be easily improved, with solid theoretical support and no extra parameters. We reorganize the residual design in higher-order forms, inspired by the observation that many effective networks can be interpreted as different numerical discretizations of differential equations. A single ResNet block follows a relatively simple scheme, the forward Euler method; however, the situation quickly becomes complicated as blocks are stacked. We hypothesize that a stacked ResNet implicitly approximates a higher-order scheme, in which case the current forward-propagation rule may be weak compared with a typical higher-order method such as Runge-Kutta. We propose higher-order ResNets to verify this hypothesis through extensive experiments on widely used computer vision benchmarks. We observe stable and noticeable gains in performance, together with improved convergence and robustness.
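To make the ODE analogy concrete, below is a minimal PyTorch-style sketch, not the authors' implementation: the module names `ResidualF`, `EulerBlock`, and `RK2Block` are hypothetical. It contrasts a standard residual block, which matches one forward Euler step $x_{n+1} = x_n + f(x_n)$, with a second-order Runge-Kutta (Heun) step $x_{n+1} = x_n + \tfrac{1}{2}\big(f(x_n) + f(x_n + f(x_n))\big)$ that reuses the same residual branch and therefore adds no parameters.

```python
import torch
import torch.nn as nn


class ResidualF(nn.Module):
    """The residual branch f(x): two 3x3 convs, as in a basic ResNet block."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return self.body(x)


class EulerBlock(nn.Module):
    """Standard residual block: one forward Euler step, x + f(x)."""

    def __init__(self, channels):
        super().__init__()
        self.f = ResidualF(channels)

    def forward(self, x):
        return x + self.f(x)


class RK2Block(nn.Module):
    """Second-order Runge-Kutta (Heun) step using the same f, so the
    parameter count is unchanged: x + (f(x) + f(x + f(x))) / 2."""

    def __init__(self, channels):
        super().__init__()
        self.f = ResidualF(channels)

    def forward(self, x):
        k1 = self.f(x)           # first stage: slope at x
        k2 = self.f(x + k1)      # second stage: slope at the Euler prediction
        return x + 0.5 * (k1 + k2)


if __name__ == "__main__":
    x = torch.randn(1, 16, 32, 32)
    print(EulerBlock(16)(x).shape)  # torch.Size([1, 16, 32, 32])
    print(RK2Block(16)(x).shape)    # torch.Size([1, 16, 32, 32])
```

The RK2 block evaluates the shared residual branch twice, trading extra computation for a higher-order update at identical parameter cost, which is one plausible reading of the "no extra parameters" claim above.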