In this paper, we study the importance of pruning in Deep Networks (DNs) and the yin-and-yang relationship between (1) pruning highly overparametrized DNs that have been trained from random initialization and (2) training small DNs that have been "cleverly" initialized. Since in most cases practitioners can only resort to random initialization, there is a strong need to develop a grounded understanding of DN pruning. The current literature remains largely empirical and lacks a theoretical understanding of how pruning affects a DN's decision boundary, how to interpret pruning, and how to design correspondingly principled pruning techniques. To tackle these questions, we employ recent advances in the theoretical analysis of Continuous Piecewise Affine (CPA) DNs. From this perspective, we are able to detect the early-bird (EB) ticket phenomenon, provide interpretability into current pruning techniques, and develop a principled pruning strategy. At each step of our study, we conduct extensive experiments supporting our claims and results. While our main goal is to enhance the current understanding of DN pruning rather than to develop a new pruning method, our spline pruning criteria, for both layerwise and global pruning, are on par with or even outperform state-of-the-art pruning methods.