Deployment constraints in practical applications necessitate pruning large-scale deep learning models, i.e., promoting their weight sparsity. As illustrated by the Lottery Ticket Hypothesis (LTH), pruning can also improve generalization. At the core of LTH, iterative magnitude pruning (IMP) is the predominant method for successfully finding 'winning tickets'. Yet the computational cost of IMP grows prohibitively as the target pruning ratio increases. To reduce this overhead, various efficient 'one-shot' pruning methods have been developed, but they usually fail to find winning tickets as good as those found by IMP. This raises the question: how can we close the gap between pruning accuracy and pruning efficiency? To tackle it, we pursue an algorithmic advancement of model pruning. Specifically, we formulate the pruning problem from a fresh viewpoint, bi-level optimization (BLO). We show that the BLO interpretation provides a technically grounded optimization basis for an efficient implementation of the pruning-retraining learning paradigm used in IMP. We also show that the proposed bi-level optimization-oriented pruning method (termed BiP) belongs to a special class of BLO problems with a bi-linear problem structure. By leveraging this bi-linearity, we show theoretically that BiP can be solved as easily as a first-order optimization problem, thus inheriting its computational efficiency. Through extensive experiments on both structured and unstructured pruning with 5 model architectures and 4 datasets, we demonstrate that BiP finds better winning tickets than IMP in most cases and is computationally as efficient as one-shot pruning schemes, achieving a 2-7× speedup over IMP at the same level of model accuracy and sparsity.
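To make the cost argument concrete, the following is a minimal sketch of the prune-retrain loop behind IMP, not the BiP method and not the authors' implementation. It assumes a flat list of weights, a per-round pruning rate of 20% of the surviving weights (a common choice in the LTH literature, but an assumption here), and a hypothetical caller-supplied `retrain` function standing in for a full training run. All function names are illustrative.

```python
import math


def magnitude_mask(weights, sparsity):
    """One-shot magnitude pruning: return a 0/1 mask that removes the
    `sparsity` fraction of entries with the smallest absolute value."""
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def imp_rounds(target_sparsity, rate=0.2):
    """Number of prune-retrain rounds needed to reach `target_sparsity`
    when every round removes `rate` of the weights that are still alive.
    The 20% default is an assumption, not a value taken from the text."""
    if not 0 <= target_sparsity < 1 or not 0 < rate < 1:
        raise ValueError("target_sparsity must be in [0, 1) and rate in (0, 1)")
    if target_sparsity == 0:
        return 0
    return math.ceil(math.log(1 - target_sparsity) / math.log(1 - rate))


def iterative_magnitude_pruning(weights, target_sparsity, retrain, rate=0.2):
    """Sketch of IMP: alternate a (hypothetical) `retrain(weights, mask)` call
    with magnitude pruning of the surviving weights until the target sparsity
    is reached. Returns the final mask and the number of retraining rounds."""
    mask = [1] * len(weights)
    remaining = len(weights)
    target_remaining = round((1 - target_sparsity) * len(weights))
    rounds = 0
    while remaining > target_remaining:
        weights = retrain(weights, mask)  # the expensive step, once per round
        rounds += 1
        keep = max(target_remaining, int(remaining * (1 - rate)))
        alive = sorted((i for i in range(len(weights)) if mask[i]),
                       key=lambda i: abs(weights[i]))
        for i in alive[:remaining - keep]:  # drop the smallest survivors
            mask[i] = 0
        remaining = keep
    return mask, rounds


if __name__ == "__main__":
    for s in (0.5, 0.9):
        print(f"sparsity {s:.0%}: {imp_rounds(s)} retraining rounds")
```

Under these assumptions, one-shot pruning needs a single pass, whereas the number of retraining rounds in the iterative loop grows with the logarithm of the fraction of weights kept. This is the overhead the abstract refers to when it says the cost of IMP grows with the target pruning ratio.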