In this paper, a simple yet effective network pruning framework is proposed to simultaneously address the problems of the pruning indicator, the pruning ratio, and the efficiency constraint. This paper argues that the pruning decision should depend on the convolutional weights, and thus proposes novel weight-dependent gates (W-Gates) that learn information from the filter weights and produce binary gates to prune or keep each filter automatically. To prune the network under efficiency constraints, a switchable Efficiency Module is constructed to predict the hardware latency or FLOPs of candidate pruned networks. Combined with the proposed Efficiency Module, W-Gates can perform filter pruning in an efficiency-aware manner and achieve a compact network with a better accuracy-efficiency trade-off. We demonstrate the effectiveness of the proposed method on ResNet34, ResNet50, and MobileNet V2, achieving up to 1.33%/1.28%/1.1% higher Top-1 accuracy, respectively, with lower hardware latency on ImageNet. Compared with state-of-the-art methods, W-Gates also achieves superior performance.
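The gating idea above can be sketched in a few lines; this is a minimal NumPy illustration under stated assumptions, not the paper's implementation. The gate parameters `gate_w` and `gate_b` are hypothetical stand-ins for the learned mapping from filter weights to gate scores, and the hard threshold stands in for the binarization step (during training, passing gradients through such a binary gate would require a trick like a straight-through estimator).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy convolutional layer: num_filters filters of shape (in_ch, k, k).
num_filters, in_ch, k = 8, 4, 3
filters = rng.standard_normal((num_filters, in_ch, k, k))

# Hypothetical gate parameters: in the paper these would be learned; here
# a random linear probe stands in for the weight-dependent gate network.
gate_w = rng.standard_normal(in_ch * k * k)
gate_b = 0.0

# Each filter's own weights are mapped to a scalar score, then binarized
# into a keep (1) / prune (0) decision.
scores = filters.reshape(num_filters, -1) @ gate_w + gate_b
gates = (scores > 0).astype(np.float64)

# Filters whose gate is 0 are pruned (zeroed out here for illustration).
pruned = filters * gates[:, None, None, None]
kept = int(gates.sum())
print(f"kept {kept} of {num_filters} filters")
```

In an actual pruned network the zeroed filters would be removed entirely, shrinking the layer; zeroing them is equivalent for the forward pass and keeps the sketch short.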
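For the efficiency side, a FLOPs-based cost predictor for a candidate pruned network can be sketched as below. This is an assumed, simplified stand-in: it counts exact convolution multiply-accumulates, whereas the paper's Efficiency Module is described as switchable between FLOPs and predicted hardware latency (the latter would replace this counter with, e.g., a latency lookup table). The layer list is illustrative, not from the paper.

```python
def conv_flops(out_h, out_w, in_ch, out_ch, k):
    # Multiply-accumulate count for a standard (non-grouped, stride-folded)
    # convolution producing an out_h x out_w x out_ch feature map.
    return out_h * out_w * in_ch * out_ch * k * k

# Hypothetical candidate pruned network:
# (output height, input channels, output channels, kernel size) per layer.
layers = [(56, 64, 128, 3), (28, 128, 256, 3)]
total = sum(conv_flops(h, h, ci, co, k) for (h, ci, co, k) in layers)
print(total)
```

Summing such per-layer costs over a candidate channel configuration gives the efficiency signal that, combined with the accuracy loss, lets the gates be trained in an efficiency-aware manner.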