Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning

Network pruning is a widely used technique for reducing the computation cost and model size of deep neural networks. However, the typical three-stage pipeline (training, pruning, and retraining) significantly increases the overall training time. In this paper, we develop a systematic weight-pruning optimization approach based on Surrogate Lagrangian Relaxation (SLR), which is tailored to overcome difficulties caused by the discrete nature of the weight-pruning problem. We prove that our method ensures fast convergence of the model compression problem, and that the convergence of SLR is accelerated by using quadratic penalties. Model parameters obtained by SLR during the training phase are much closer to their optimal values than those obtained by other state-of-the-art methods. We evaluate our method on image classification tasks using CIFAR-10 and ImageNet with the state-of-the-art MLP-Mixer and Swin Transformer as well as VGG-16, ResNet-18, ResNet-50, ResNet-110, and MobileNetV2. We also evaluate object detection and segmentation tasks on the COCO and KITTI benchmarks and the TuSimple lane detection dataset using a variety of models. Experimental results demonstrate that our SLR-based weight-pruning optimization approach achieves a higher compression rate than state-of-the-art methods under the same accuracy requirement, and higher accuracy under the same compression rate requirement. On classification tasks, our SLR approach converges to the desired accuracy $3\times$ faster on both datasets. On object detection and segmentation tasks, SLR converges to the desired accuracy $2\times$ faster. Further, SLR achieves high model accuracy even at the hard-pruning stage without retraining, which reduces the traditional three-stage pruning pipeline to a two-stage process. Given a limited budget of retraining epochs, our approach quickly recovers the model's accuracy.
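To make the "hard-pruning stage" concrete, the following is a minimal sketch of what such a step could look like, assuming unstructured magnitude-based pruning of a single weight tensor. It is not the SLR procedure itself. The function name `hard_prune` and the `sparsity` parameter are hypothetical and do not come from the paper.

```python
import numpy as np


def hard_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a copy of `weights` with the smallest-magnitude entries set to zero.

    `sparsity` is the fraction of entries to remove (0.0 keeps everything).
    This is an illustrative assumption about the hard-pruning step, not the
    paper's exact rule.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")

    n_prune = int(round(sparsity * weights.size))
    if n_prune == 0:
        return weights.copy()

    magnitudes = np.abs(weights).ravel()
    # Indices of the n_prune smallest magnitudes (order within them is irrelevant).
    prune_idx = np.argpartition(magnitudes, n_prune - 1)[:n_prune]

    keep_mask = np.ones(weights.size, dtype=bool)
    keep_mask[prune_idx] = False
    return (weights.ravel() * keep_mask).reshape(weights.shape)


if __name__ == "__main__":
    w = np.array([[0.5, -0.1], [0.05, -2.0]])
    print(hard_prune(w, sparsity=0.5))
```

In a conventional pipeline, a step like this is followed by retraining to recover accuracy. The abstract's claim is that weights trained with SLR already retain high accuracy after this step, so the retraining stage can be shortened or skipped.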
