Recent work has explored the possibility of pruning neural networks at initialization. We assess proposals for doing so: SNIP (Lee et al., 2019), GraSP (Wang et al., 2020), SynFlow (Tanaka et al., 2020), and magnitude pruning. Although these methods surpass the trivial baseline of random pruning, they remain below the accuracy of magnitude pruning after training, and we endeavor to understand why. We show that, unlike pruning after training, randomly shuffling the weights these methods prune within each layer or sampling new initial values preserves or improves accuracy. As such, the per-weight pruning decisions made by these methods can be replaced by a per-layer choice of the fraction of weights to prune. This property suggests broader challenges with the underlying pruning heuristics, the desire to prune at initialization, or both.
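The per-layer shuffling ablation described above can be illustrated concretely. The sketch below is a minimal illustration, not the authors' code: it assumes a hypothetical dictionary of per-layer binary masks (1 = keep, 0 = prune) produced by any pruning-at-initialization method, and randomly permutes each layer's mask so that the layer-wise fraction of pruned weights is preserved while the per-weight decisions are discarded.

```python
import torch


def shuffle_masks_per_layer(masks: dict) -> dict:
    """Randomly permute each layer's pruning mask.

    Keeps every layer's density (fraction of retained weights) fixed,
    but randomizes which individual weights are kept.
    `masks` maps layer names to binary tensors (1 = keep, 0 = prune).
    """
    shuffled = {}
    for name, mask in masks.items():
        flat = mask.flatten()
        perm = torch.randperm(flat.numel())
        shuffled[name] = flat[perm].reshape(mask.shape)
    return shuffled


# Hypothetical example: two layers at roughly 50% and 80% sparsity.
masks = {
    "fc1.weight": (torch.rand(256, 784) > 0.5).float(),
    "fc2.weight": (torch.rand(10, 256) > 0.8).float(),
}
shuffled = shuffle_masks_per_layer(masks)

# Per-layer density is unchanged; only the per-weight choices differ.
for name in masks:
    print(name, masks[name].mean().item(), shuffled[name].mean().item())
```

The key design point is that shuffling operates within each layer independently, so the only information retained from the original pruning method is its per-layer choice of how many weights to prune.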