Which convolutional neural network (CNN) structures perform well remains an open and fascinating question. In this work, we take one step toward an answer by connecting zero stability with model performance. Specifically, we find that if a discrete solver of an ordinary differential equation is zero-stable, the CNN corresponding to that solver performs well. We first interpret zero stability in the context of deep learning and then investigate the performance of existing first- and second-order CNNs under different zero-stability conditions. Based on these preliminary observations, we provide a higher-order discretization for constructing CNNs and propose a zero-stable network (ZeroSNet). To guarantee the zero stability of ZeroSNet, we first deduce a structure that meets the consistency conditions and then give the zero-stable region of a training-free parameter. By analyzing the roots of a characteristic equation, we theoretically obtain the optimal coefficients of the feature maps. Empirically, we present results from three aspects: (1) we provide extensive evidence, across different depths and datasets, that the moduli of the characteristic equation's roots are key to the performance of CNNs that require historical features; (2) ZeroSNet outperforms existing CNNs that are based on high-order discretization; and (3) ZeroSNets show better robustness against input noise. The source code is available at https://github.com/LongJin-lab/ZeroSNet.
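
As a minimal sketch of the kind of check the abstract refers to, the snippet below tests the standard root condition for a linear multistep scheme: every root of the first characteristic polynomial must have modulus at most 1, and any root on the unit circle must be simple. The function name `is_zero_stable`, the tolerance `tol`, and the example coefficients are assumptions made for illustration; they are textbook schemes, not the coefficients used in ZeroSNet.

```python
import numpy as np


def is_zero_stable(rho_coeffs, tol=1e-6):
    """Check the root condition for a linear multistep scheme.

    rho_coeffs: coefficients of the first characteristic polynomial,
    highest degree first (the convention used by numpy.roots).
    The name and tolerance are hypothetical, chosen for this sketch.
    """
    roots = np.roots(rho_coeffs)
    moduli = np.abs(roots)
    # Any root outside the unit disc breaks zero stability.
    if np.any(moduli > 1.0 + tol):
        return False
    # Roots on the unit circle must be simple (not repeated).
    on_circle = roots[np.abs(moduli - 1.0) <= tol]
    for i in range(len(on_circle)):
        for j in range(i + 1, len(on_circle)):
            if abs(on_circle[i] - on_circle[j]) <= tol:
                return False
    return True


# Textbook examples (illustrative assumptions, not ZeroSNet coefficients):
print(is_zero_stable([1, -1]))     # forward Euler, rho(z) = z - 1
print(is_zero_stable([1, -1, 0]))  # two-step Adams-Bashforth, rho(z) = z^2 - z
print(is_zero_stable([1, 4, -5]))  # rho(z) = z^2 + 4z - 5, roots 1 and -5
```

The first two schemes satisfy the root condition, while the third does not because of its root at -5. Numerically computed repeated roots are only approximate, so the simplicity test depends on the chosen tolerance.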