Very deep convolutional neural networks offer excellent recognition results, yet their computational expense limits their impact for many real-world applications. We introduce BlockDrop, an approach that learns to dynamically choose which layers of a deep network to execute during inference so as to best reduce total computation without degrading prediction accuracy. Exploiting the robustness of Residual Networks (ResNets) to layer dropping, our framework selects on-the-fly which residual blocks to evaluate for a given novel image. In particular, given a pretrained ResNet, we train a policy network in an associative reinforcement learning setting for the dual reward of utilizing a minimal number of blocks while preserving recognition accuracy. We conduct extensive experiments on CIFAR and ImageNet. The results provide strong quantitative and qualitative evidence that these learned policies not only accelerate inference but also encode meaningful visual information. Built upon a ResNet-101 model, our method achieves a speedup of 20\% on average, going as high as 36\% for some images, while maintaining the same 76.4\% top-1 accuracy on ImageNet.
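The dual reward described above can be illustrated with a brief sketch. This is not the authors' implementation; the quadratic usage penalty and the penalty constant `gamma` below are assumptions chosen for illustration of the trade-off between executing few blocks and keeping the prediction correct.

```python
# Minimal sketch of a dual reward for dynamic block selection:
# encourage executing few residual blocks while preserving accuracy.
# The quadratic penalty and `gamma` are illustrative assumptions.
import numpy as np

def block_drop_reward(policy, correct, gamma=1.0):
    """policy: binary vector, 1 = execute that residual block.
    correct: whether the prediction using the selected blocks was right."""
    usage = policy.sum() / policy.size   # fraction of blocks executed
    if correct:
        return 1.0 - usage ** 2          # fewer blocks -> higher reward
    return -gamma                        # wrong predictions are penalized

# Example: a correct prediction that used 5 of 15 residual blocks.
policy = np.array([1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
print(block_drop_reward(policy, correct=True))  # ~0.889
```

Under this kind of reward, a policy network trained with reinforcement learning is pushed to drop as many blocks as possible per image, but only up to the point where the prediction remains correct.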