Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression. However, these two techniques are traditionally applied in isolation, leading to a significant accuracy drop when high compression rates are pursued. In this paper, we propose a Collaborative Compression (CC) scheme that unites channel pruning and tensor decomposition to compress CNN models by simultaneously learning the model's sparsity and low-rankness. Specifically, we first investigate the compression sensitivity of each layer in the network, and then propose a Global Compression Rate Optimization that transforms the decision problem of per-layer compression rates into an optimization problem. After that, we propose a multi-step heuristic compression scheme that removes redundant compression units step by step while fully accounting for the remaining compression space (i.e., the unremoved compression units). Our method achieves superior performance over previous approaches on various datasets and backbone architectures. For example, we achieve a 52.9% FLOPs reduction by removing 48.4% of the parameters of ResNet-50, with only a 0.56% Top-1 accuracy drop on ImageNet 2012.
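
To make the idea of jointly exploiting sparsity and low-rankness concrete, the following is a minimal, hypothetical PyTorch sketch, not the CC method itself: it prunes a convolution's output channels by L1 norm and then truncates the remaining weights with an SVD, rebuilding the layer as two smaller convolutions. The function `compress_conv` and the parameters `keep_ratio` and `rank` are illustrative assumptions rather than names from the paper.

```python
# Hypothetical sketch: combine channel pruning with low-rank (SVD) truncation
# of a single convolutional layer's weights. Bias terms are ignored for brevity.
import torch
import torch.nn as nn

def compress_conv(conv: nn.Conv2d, keep_ratio: float = 0.5, rank: int = 16):
    """Prune output channels by L1 norm, then approximate the remaining
    weights with a rank-`rank` factorization (k x k conv followed by 1x1 conv)."""
    W = conv.weight.data                      # (out_c, in_c, kh, kw)
    out_c, in_c, kh, kw = W.shape

    # Channel pruning: keep the output channels with the largest L1 norms.
    n_keep = max(1, int(out_c * keep_ratio))
    scores = W.abs().sum(dim=(1, 2, 3))
    keep = torch.topk(scores, n_keep).indices
    W_pruned = W[keep]                        # (n_keep, in_c, kh, kw)

    # Low-rank decomposition: SVD of the flattened weight matrix.
    W_mat = W_pruned.reshape(n_keep, -1)      # (n_keep, in_c*kh*kw)
    U, S, Vh = torch.linalg.svd(W_mat, full_matrices=False)
    r = min(rank, S.numel())
    A = U[:, :r] * S[:r]                      # (n_keep, r)
    B = Vh[:r]                                # (r, in_c*kh*kw)

    # Reassemble as two smaller convolutions: a k x k conv with r filters,
    # followed by a 1x1 conv expanding r -> n_keep channels.
    conv_b = nn.Conv2d(in_c, r, (kh, kw), stride=conv.stride,
                       padding=conv.padding, bias=False)
    conv_b.weight.data = B.reshape(r, in_c, kh, kw)
    conv_a = nn.Conv2d(r, n_keep, 1, bias=False)
    conv_a.weight.data = A.reshape(n_keep, r, 1, 1)
    return nn.Sequential(conv_b, conv_a), keep

# Example: compress a 64->128 3x3 convolution.
layer = nn.Conv2d(64, 128, 3, padding=1)
compressed, kept_channels = compress_conv(layer, keep_ratio=0.5, rank=16)
x = torch.randn(1, 64, 32, 32)
print(compressed(x).shape)                    # torch.Size([1, 64, 32, 32])
```

In this toy setting, the pruning step shrinks the number of output channels while the SVD step shrinks the effective rank of what remains; CC's contribution is to decide both jointly under a global compression-rate objective rather than applying each heuristic independently.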