Deep neural networks are known to be difficult to train due to the instability of back-propagation. A deep \emph{residual network} (ResNet) with identity loops remedies this by stabilizing gradient computations. We prove a boosting theory for the ResNet architecture. We construct $T$ weak module classifiers, each containing two of the $T$ layers, such that the combined strong learner is a ResNet. We therefore introduce an alternative deep ResNet training algorithm, \emph{BoostResNet}, which is particularly suitable for non-differentiable architectures. The proposed algorithm merely requires sequential training of $T$ ``shallow ResNets'', which is computationally inexpensive. We prove that the training error decays exponentially with the depth $T$ if the \emph{weak module classifiers} we train perform slightly better than some weak baseline. In other words, we propose a weak learning condition and prove a boosting theory for ResNet under this condition. Our results apply to general multi-class ResNets. A generalization error bound based on margin theory is proved, suggesting that ResNet is resistant to overfitting when the network weights are bounded in $l_1$ norm.
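As a rough illustration of the construction described above (the symbols $g_t$, $f_t$, $o$, and $\gamma$ are notational assumptions introduced here for the sketch, not definitions from this abstract), suppose the $t$-th residual block computes $g_{t+1}(x) = g_t(x) + f_t(g_t(x))$ and $o(\cdot)$ denotes the output hypothesis applied to a block's representation. One way to obtain weak module classifiers whose combination is exactly the ResNet is a telescoping choice,
\[
  h_t(x) \;=\; o\bigl(g_{t+1}(x)\bigr) - o\bigl(g_t(x)\bigr), \qquad t = 1,\dots,T,
  \qquad\text{so that}\qquad
  \sum_{t=1}^{T} h_t(x) \;=\; o\bigl(g_{T+1}(x)\bigr) - o\bigl(g_1(x)\bigr),
\]
i.e., the strong learner collapses to the ResNet output (up to the input term). Under a weak learning condition in which each $h_t$ beats the weak baseline by an edge of at least $\gamma > 0$, a boosting-style bound of the form $\mathrm{err}_{\text{train}} \le \exp(-c\,\gamma^2 T)$ for some constant $c > 0$ captures the exponential decay with depth claimed above; the precise statement and constants are given in the main text.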