Structured pruning compresses neural networks by reducing the number of channels (filters), enabling fast inference and a small run-time footprint. To restore accuracy after pruning, fine-tuning is usually applied to the pruned network. However, the small number of remaining parameters in a pruned network inevitably makes it hard for fine-tuning to recover the lost accuracy. To address this challenge, we propose a novel method that first linearly over-parameterizes the compact layers in the pruned network to enlarge the number of fine-tuning parameters, and then re-parameterizes them back to the original layers after fine-tuning. Specifically, we equivalently expand each convolution/linear layer into several consecutive convolution/linear layers that do not alter the current output feature maps. Furthermore, we employ similarity-preserving knowledge distillation, which encourages the over-parameterized block to learn the immediate data-to-data similarities of the corresponding dense layer so as to maintain its feature-learning ability. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, where it significantly outperforms the vanilla fine-tuning strategy, especially at large pruning ratios.
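To make the expand-then-merge idea concrete, below is a minimal PyTorch-style sketch (an assumed framework, not the authors' released code; the helper names `expand_linear` and `merge_linear` are hypothetical) showing how a pruned linear layer can be equivalently expanded into two consecutive linear layers for fine-tuning and later re-parameterized back into a single layer.

```python
# Minimal sketch of linear over-parameterization and re-parameterization,
# assuming PyTorch. A pruned nn.Linear is expanded into two stacked linear
# layers with identical function (more trainable parameters), then merged
# back into one layer after fine-tuning.
import torch
import torch.nn as nn


def expand_linear(layer: nn.Linear, hidden: int) -> nn.Sequential:
    """Expand `layer` into two consecutive linear layers whose composition equals it."""
    out_f, in_f = layer.weight.shape
    assert hidden >= out_f, "hidden width must be at least the output width"
    first = nn.Linear(in_f, hidden, bias=True)
    second = nn.Linear(hidden, out_f, bias=True)
    with torch.no_grad():
        # first layer: original weight in the top rows, zeros elsewhere, zero bias
        first.weight.zero_()
        first.weight[:out_f].copy_(layer.weight)
        first.bias.zero_()
        # second layer: identity on the top block, original bias,
        # so second(first(x)) == layer(x) at initialization
        second.weight.zero_()
        second.weight[:, :out_f].copy_(torch.eye(out_f))
        second.bias.copy_(layer.bias)
    return nn.Sequential(first, second)


def merge_linear(block: nn.Sequential) -> nn.Linear:
    """Re-parameterize two consecutive linear layers back into a single layer."""
    first, second = block[0], block[1]
    merged = nn.Linear(first.in_features, second.out_features, bias=True)
    with torch.no_grad():
        # y = W2 (W1 x + b1) + b2  ->  weight = W2 W1, bias = W2 b1 + b2
        merged.weight.copy_(second.weight @ first.weight)
        merged.bias.copy_(second.weight @ first.bias + second.bias)
    return merged


# Sanity check: both the expansion and the merge preserve the original mapping.
layer = nn.Linear(16, 8)
x = torch.randn(4, 16)
block = expand_linear(layer, hidden=32)
assert torch.allclose(layer(x), block(x), atol=1e-6)
assert torch.allclose(layer(x), merge_linear(block)(x), atol=1e-5)
```

The same composition rule carries over to consecutive convolution layers (e.g., a 1x1 convolution composed with a kxk convolution), since both are linear maps when no nonlinearity is inserted between them.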