Structured pruning methods are among the most effective strategies for extracting small, resource-efficient convolutional neural networks from their dense counterparts with minimal loss in accuracy. However, most existing methods still suffer from one or more limitations, including: 1) the need to train the dense model from scratch with pruning-related parameters embedded in the architecture, 2) the requirement of model-specific hyperparameter settings, 3) the inability to include budget-related constraints in the training process, and 4) instability under extreme pruning scenarios. In this paper, we present ChipNet, a deterministic pruning strategy that employs a continuous Heaviside function and a novel crispness loss to identify a highly sparse network within an existing dense network. Our choice of the continuous Heaviside function is inspired by the field of design optimization, where the material distribution task is posed as a continuous optimization problem, yet only discrete values (0 or 1) are practically feasible and expected as final outcomes. The flexible design of our approach facilitates its use with different choices of budget constraints while maintaining stability for very low target budgets. Experimental results show that ChipNet outperforms state-of-the-art structured pruning methods by remarkable margins of up to 16.1% in terms of accuracy. Further, we show that the masks obtained with ChipNet are transferable across datasets. In certain cases, masks transferred from a model trained on a feature-rich teacher dataset provide better performance on the student dataset than those obtained by pruning directly on the student data itself.
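To make the role of the continuous Heaviside function concrete, the following is a minimal illustrative sketch rather than the exact formulation used in ChipNet: a smooth, tanh-based Heaviside projection of the kind commonly used in design (topology) optimization, with a sharpness parameter $\beta$ and threshold $\eta$ introduced here purely for illustration,
$$
\tilde{H}_{\beta,\eta}(x) \;=\; \frac{\tanh(\beta\eta) + \tanh\!\big(\beta(x-\eta)\big)}{\tanh(\beta\eta) + \tanh\!\big(\beta(1-\eta)\big)},
$$
which is differentiable in $x$ for finite $\beta$ and approaches the discrete step function at threshold $\eta$ as $\beta \to \infty$, driving gate values toward the practically feasible set $\{0, 1\}$ while still permitting gradient-based optimization.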