In recent years, deep network pruning has attracted significant attention as a way to enable the rapid deployment of AI on small devices with computation and memory constraints. Pruning is often achieved by dropping redundant weights, neurons, or layers of a deep network while attempting to retain comparable test performance. Many deep pruning algorithms have been proposed with impressive empirical success. However, existing approaches lack a quantifiable measure to estimate the compressibility of a sub-network during each pruning iteration and thus may under-prune or over-prune the model. In this work, we propose the PQ Index (PQI) to measure the potential compressibility of deep neural networks and use it to develop a Sparsity-informed Adaptive Pruning (SAP) algorithm. Our extensive experiments corroborate the hypothesis that for a generic pruning procedure, PQI first decreases while a large model is being effectively regularized, then increases when its compressibility reaches a limit that appears to correspond to the onset of underfitting. Subsequently, PQI decreases again when the model collapses and its performance deteriorates significantly. Additionally, our experiments demonstrate that the proposed adaptive pruning algorithm, with a proper choice of hyper-parameters, is superior to iterative pruning algorithms such as lottery ticket-based methods in terms of both compression efficiency and robustness.
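The abstract does not spell out how PQI is computed or how SAP uses it, so the sketch below is only an illustration of the general idea: a norm-ratio sparsity measure in [0, 1] tracks how concentrated a weight vector's magnitude is, and an adaptive schedule prunes less aggressively as measured sparsity rises. The measure shown is the classical Hoyer sparsity (based on the L1/L2 norm ratio), used here as a hypothetical stand-in for PQI, and the pruning schedule is invented for illustration; neither is the paper's exact formulation.

```python
import numpy as np

def norm_ratio_sparsity(w, eps=1e-12):
    # Hoyer-style sparsity in [0, 1]: 0 when all entries have equal
    # magnitude, approaching 1 as magnitude concentrates in few entries.
    # Hypothetical stand-in for PQI, not the paper's definition.
    w = np.abs(np.ravel(w))
    d = w.size
    l1 = w.sum()
    l2 = np.sqrt((w ** 2).sum()) + eps
    return (np.sqrt(d) - l1 / l2) / (np.sqrt(d) - 1)

def magnitude_prune(w, ratio):
    # Zero out (at least) the smallest `ratio` fraction of weights
    # by absolute magnitude.
    flat = np.abs(np.ravel(w))
    k = int(ratio * flat.size)
    if k == 0:
        return w.copy()
    thresh = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

# Illustrative adaptive loop (invented schedule): prune a larger
# fraction while measured sparsity is low, back off as it grows.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
for step in range(3):
    s = norm_ratio_sparsity(w)
    w = magnitude_prune(w, ratio=0.5 * (1.0 - s))
```

In this sketch, each iteration re-measures sparsity and shrinks the pruning ratio accordingly, mirroring the abstract's point that a compressibility measure can tell an iterative procedure when to stop pruning aggressively.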