Channel pruning is one of the major compression approaches for deep neural networks. While previous pruning methods have mostly focused on identifying unimportant channels, channel pruning has recently been regarded as a special case of neural architecture search. However, existing methods are either complicated or prone to sub-optimal pruning. In this paper, we propose AdaPruner, a pruning framework that adaptively determines both the number of channels in each layer and the weight inheritance criterion for the sub-network. First, it evaluates the importance of each block in the network based on the mean of the scaling parameters of its BN layers. Second, it uses the bisection method to quickly find a compact sub-network satisfying the budget. Finally, it adaptively and efficiently chooses the weight inheritance criterion that best fits the current architecture and fine-tunes the pruned network to recover performance. AdaPruner yields pruned networks quickly, accurately, and efficiently, taking both the structure and the initialization weights into account. We prune popular CNN models (VGG, ResNet, MobileNetV2) on several image classification datasets, and the experimental results demonstrate the effectiveness of the proposed method. On ImageNet, we reduce the FLOPs of MobileNetV2 by 32.8% with only a 0.62% decrease in top-1 accuracy, which exceeds all previous state-of-the-art channel pruning methods. The code will be released.
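To make the two search steps concrete, the sketch below illustrates (under our own simplifying assumptions, not the authors' released implementation) how block importance could be derived from the mean of BN scaling factors and how a bisection over a global pruning strength could hit a FLOPs budget; the helpers compute_flops and the keep-ratio rule are hypothetical placeholders.

import torch
import torch.nn as nn

def block_importance(block: nn.Module) -> float:
    # Importance proxy: mean absolute BN scaling factor (gamma) over the block.
    gammas = [m.weight.abs().mean() for m in block.modules()
              if isinstance(m, nn.BatchNorm2d)]
    return torch.stack(gammas).mean().item() if gammas else 0.0

def search_widths(blocks, budget_flops, compute_flops, iters=20):
    # Bisection on a global pruning strength in [0, 1]; blocks with lower
    # BN-based importance receive smaller keep-ratios (illustrative rule).
    scores = [block_importance(b) for b in blocks]
    top = max(scores) if scores else 1.0
    lo, hi = 0.0, 1.0
    ratios = [1.0] * len(blocks)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        ratios = [max(0.1, 1.0 - mid * (1.0 - s / top)) for s in scores]
        if compute_flops(ratios) > budget_flops:
            lo = mid   # still over budget: prune more aggressively
        else:
            hi = mid   # under budget: relax pruning
    return ratios

The returned per-block keep-ratios would then define the compact sub-network, after which a weight inheritance criterion is selected and the network is fine-tuned, as described above.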