Network compression has been widely studied because it reduces memory and computation costs during inference. However, previous methods seldom handle complicated structures such as residual connections, group/depth-wise convolutions, and feature pyramid networks, where the channels of multiple layers are coupled and must be pruned simultaneously. In this paper, we present a general channel pruning approach that can be applied to various complicated structures. Specifically, we propose a layer grouping algorithm to find coupled channels automatically. We then derive a unified metric based on Fisher information to evaluate the importance of both single channels and coupled channels. Moreover, we find that inference speedup on GPUs correlates more closely with memory reduction than with FLOPs reduction, so we normalize each channel's importance by the memory it would free. Our method can prune any structure, including those with coupled channels. We conduct extensive experiments on various backbones, including the classic ResNet and ResNeXt, the mobile-friendly MobileNetV2, and the NAS-based RegNet, on both image classification and object detection, the latter of which is under-explored in pruning. Experimental results validate that our method can effectively prune sophisticated networks, boosting inference speed without sacrificing accuracy.
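To make the Fisher-based importance concrete, below is a minimal PyTorch sketch of the idea, not the authors' implementation: names such as MaskedConv, fisher_importance, and memory_saving are our own illustrative assumptions. The sketch inserts an all-ones differentiable mask on a layer's output channels, accumulates the squared mask gradient as a Fisher estimate of each channel's importance, and divides by the memory each channel frees; coupled layers would share a single mask, so their gradients sum before squaring.

```python
import torch
import torch.nn as nn

class MaskedConv(nn.Module):
    """Wrap a conv layer with a differentiable all-ones channel mask
    (hypothetical helper; coupled layers would share one mask)."""
    def __init__(self, conv):
        super().__init__()
        self.conv = conv
        self.mask = nn.Parameter(torch.ones(conv.out_channels))

    def forward(self, x):
        return self.conv(x) * self.mask.view(1, -1, 1, 1)

def fisher_importance(model, masks, loader, loss_fn, memory_saving):
    """Estimate per-channel importance as accumulated squared mask
    gradients (a Fisher-information proxy), normalized by the
    activation/parameter memory each channel frees when removed.
    memory_saving[i] is a (C,) tensor per masked layer (our assumption)."""
    fisher = [torch.zeros_like(m.mask) for m in masks]
    for x, y in loader:
        loss = loss_fn(model(x), y)
        grads = torch.autograd.grad(loss, [m.mask for m in masks])
        for f, g in zip(fisher, grads):
            f += g.pow(2)  # batch-level approximation of the Fisher term
    # Memory-normalized importance: prune channels with the lowest scores.
    return [f / s for f, s in zip(fisher, memory_saving)]
```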