Over-parameterization of neural networks benefits optimization and generalization, yet it incurs costs in practice. Pruning is adopted as a post-processing solution to this problem: it aims to remove unnecessary parameters from a neural network with little compromise in performance. It has been broadly believed that the resulting sparse neural network cannot be trained from scratch to comparable accuracy. However, several recent works (e.g., [Frankle and Carbin, 2019a]) challenge this belief by discovering random sparse networks that can be trained to match the performance of their dense counterparts. This new pruning paradigm later inspires more methods of pruning at initialization. In spite of the encouraging progress, how to coordinate these new pruning fashions with traditional pruning has not yet been explored. This survey seeks to bridge the gap by proposing a general pruning framework in which the emerging pruning paradigms can be accommodated alongside the traditional one. With this framework, we systematically reflect on the major differences and new insights brought by these new pruning fashions, with representative works discussed at length. Finally, we summarize the open questions as worthy future directions.
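To make the pruning operation described above concrete, the sketch below illustrates one-shot magnitude pruning of a single weight matrix: the smallest-magnitude entries are removed (set to zero) while the rest are kept. This is a minimal illustration under assumed names (the helper `magnitude_prune` is hypothetical), not the specific algorithm of any work cited in this survey.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries of a weight matrix.

    weights:  dense weight matrix of one layer
    sparsity: fraction of entries to remove (0.0 keeps all, 0.9 removes 90%)
    Returns a binary mask of the same shape (1 = keep, 0 = prune).
    """
    k = int(sparsity * weights.size)  # number of entries to prune
    if k == 0:
        return np.ones_like(weights)
    # Threshold at the k-th smallest absolute value; keep everything above it.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)

# Example: prune 80% of a randomly initialized 4x4 layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
mask = magnitude_prune(W, sparsity=0.8)
W_sparse = W * mask  # the resulting sparse weights
print(f"remaining weights: {int(mask.sum())} / {W.size}")
```

In traditional pruning this mask is derived from a trained dense network and followed by fine-tuning, whereas the emerging paradigms discussed in this survey apply such masks at (or near) initialization and train the sparse network from scratch.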