Channel pruning is a promising technique for compressing the parameters of deep convolutional neural networks (DCNNs) and speeding up inference. This paper aims to address the long-standing inefficiency of channel pruning. Most channel pruning methods recover prediction accuracy by re-training the pruned model from the remaining parameters or from random initialization. This re-training process depends heavily on the availability of computational resources and training data, as well as on human intervention (tuning the training strategy). In this paper, a highly efficient pruning method is proposed to significantly reduce the cost of pruning DCNNs. The main contributions of our method are: 1) pruning compensation, a fast and data-efficient substitute for re-training that minimizes the post-pruning reconstruction loss of features; 2) compensation-aware pruning (CaP), a novel pruning algorithm that removes redundant or less-weighted channels by minimizing the loss of information; and 3) binary structural search with a step constraint to minimize human intervention. On benchmarks including CIFAR-10/100 and ImageNet, our method achieves pruning performance competitive with state-of-the-art re-training-based pruning methods while reducing processing time by 95% and data usage by 90%.
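To give a concrete sense of what "minimizing the post-pruning reconstruction loss of features" can mean, the sketch below solves a least-squares reconstruction over a small calibration batch, viewing the consuming layer as a linear map. This is only an illustrative assumption, not the paper's exact compensation procedure; the function `compensate_pruning` and its formulation are hypothetical.

```python
import numpy as np

def compensate_pruning(W_next, X, keep_idx):
    """Hypothetical sketch of pruning compensation via least-squares
    feature reconstruction (not necessarily the paper's algorithm).

    W_next:   (C_out, C_in) weights of the layer consuming the features,
              seen as a linear/1x1-conv map for simplicity.
    X:        (N, C_in) input features sampled from a small calibration set.
    keep_idx: indices of the channels kept after pruning.

    Returns adjusted weights for the kept channels so that the layer's
    output on X is reconstructed as closely as possible.
    """
    Y = X @ W_next.T                  # original outputs to preserve
    X_kept = X[:, keep_idx]           # features from surviving channels
    # Least squares: find W' minimizing ||X_kept @ W'.T - Y||_F^2
    W_adj, *_ = np.linalg.lstsq(X_kept, Y, rcond=None)
    return W_adj.T                    # shape (C_out, len(keep_idx))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((512, 64))      # small calibration batch
    W_next = rng.standard_normal((128, 64))
    keep = np.arange(0, 64, 2)              # prune every other channel
    W_new = compensate_pruning(W_next, X, keep)
    err = np.linalg.norm(X[:, keep] @ W_new.T - X @ W_next.T)
    print("feature reconstruction error:", err)
```

Because only a closed-form solve over a few calibration samples is needed, such a compensation step avoids the compute- and data-heavy re-training loop that the abstract identifies as the main cost of conventional channel pruning.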