Network pruning is a promising avenue for compressing deep neural networks. A typical approach to pruning begins by training a model and then removing unnecessary parameters while minimizing the impact on what is learned. Alternatively, a recent approach shows that pruning can be done at initialization, prior to training. However, it remains unclear exactly why pruning an untrained, randomly initialized neural network is effective. In this work, we consider the pruning problem from a signal propagation perspective, formally characterizing initialization conditions that ensure faithful signal propagation throughout a network. Based on singular values of a network's input-output Jacobian, we find that orthogonal initialization enables more faithful signal propagation than other initialization schemes, thereby enhancing pruning results on a range of modern architectures and datasets. We also empirically study the effect of supervision for pruning at initialization, and show that unsupervised pruning can often be as effective as supervised pruning. Furthermore, we demonstrate that our signal propagation perspective, combined with unsupervised pruning, can indeed be useful in various scenarios where pruning is applied to non-standard, arbitrarily designed architectures.
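To make the signal propagation measure concrete, the sketch below (not the authors' code) illustrates one way to compare singular values of the input-output Jacobian under orthogonal versus a standard Kaiming initialization; the layer widths, tanh nonlinearity, and input dimension are illustrative assumptions rather than the paper's experimental setup.

```python
# A minimal sketch, assuming a small tanh MLP: compare the singular values of
# the input-output Jacobian under orthogonal vs. Kaiming initialization.
# Layer sizes and the comparison baseline are assumptions for illustration.
import torch
import torch.nn as nn

def make_mlp(init="orthogonal", dims=(100, 100, 100, 10)):
    layers = []
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        linear = nn.Linear(d_in, d_out, bias=False)
        if init == "orthogonal":
            nn.init.orthogonal_(linear.weight)
        else:  # Kaiming-style baseline initialization
            nn.init.kaiming_normal_(linear.weight)
        layers += [linear, nn.Tanh()]
    return nn.Sequential(*layers[:-1])  # drop the final nonlinearity

def jacobian_singular_values(model, x):
    # Input-output Jacobian evaluated at x, then its singular values.
    J = torch.autograd.functional.jacobian(model, x)
    return torch.linalg.svdvals(J)

x = torch.randn(100)
for init in ("orthogonal", "kaiming"):
    sv = jacobian_singular_values(make_mlp(init), x)
    print(f"{init:>10}: max/min singular value = {sv.max():.3f} / {sv.min():.3f}")
```

Under this setup, a tighter spread of singular values (closer to isometry) would indicate more faithful signal propagation through the network at initialization.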