Iterative Magnitude Pruning (IMP) is a network pruning method that repeats the process of removing the weights with the smallest magnitudes and retraining the model. When visualizing the weight matrices of language models pruned by IMP, previous research has shown that a structured pattern emerges, wherein the surviving weights tend to cluster prominently in a select few rows and columns of the matrix. Although the need for further research into utilizing these structured patterns for potential performance gains has previously been indicated, it has yet to be thoroughly studied. We propose SPUR (Structured Pattern pruning Using Regularization), a novel pruning mechanism that preemptively induces structured patterns during compression by adding a regularization term to the objective function of IMP. Our results show that SPUR significantly preserves model performance under high-sparsity settings regardless of the language or the task. Our contributions are as follows: (i) We propose SPUR, a network pruning mechanism that improves upon IMP regardless of the language or the task. (ii) We are the first to empirically verify the efficacy of the "structured patterns" observed previously in pruning research. (iii) SPUR is a resource-efficient mechanism in that it does not require significant additional computation.
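To make the procedure concrete, the following is a minimal sketch of an IMP loop augmented with a structure-inducing regularization term. The paper's exact regularizer is not specified here, so the row/column group-lasso penalty, the regularization strength `lam`, and the per-round sparsity schedule below are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: IMP with a row/column group-sparsity penalty.
# The penalty and hyperparameters are assumptions for illustration only.
import torch
import torch.nn as nn

def group_sparsity_penalty(weight: torch.Tensor) -> torch.Tensor:
    # Sum of L2 norms over rows and columns: pushes entire rows/columns
    # toward zero, encouraging a "structured pattern" of surviving weights.
    return weight.norm(dim=1).sum() + weight.norm(dim=0).sum()

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    # Zero out the smallest-magnitude weights until the target sparsity
    # is reached; already-pruned weights (value 0) stay pruned.
    flat = weight.abs().flatten()
    k = int(sparsity * flat.numel())
    if k == 0:
        return torch.ones_like(weight)
    threshold = flat.kthvalue(k).values
    return (weight.abs() > threshold).float()

model = nn.Linear(128, 128, bias=False)
mask = torch.ones_like(model.weight)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
lam = 1e-3  # regularization strength (illustrative value)

for round_idx in range(5):                  # IMP rounds
    for _ in range(100):                    # retraining steps per round
        x = torch.randn(32, 128)
        y = torch.randn(32, 128)
        loss = nn.functional.mse_loss(model(x), y)
        loss = loss + lam * group_sparsity_penalty(model.weight)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        with torch.no_grad():               # keep pruned weights at zero
            model.weight.mul_(mask)
    # Increase sparsity each round (e.g. 20% -> 36% -> 49% -> ...).
    target = 1 - 0.8 ** (round_idx + 1)
    mask = magnitude_prune(model.weight.data, target)
    with torch.no_grad():
        model.weight.mul_(mask)
```

In this sketch, plain IMP corresponds to setting `lam = 0`; the added penalty is what preemptively concentrates the surviving weights into a few rows and columns before each magnitude-based pruning step.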