Network pruning can significantly reduce the computation and memory footprint of large neural networks. To achieve a good trade-off between model size and performance, popular pruning techniques usually rely on hand-crafted heuristics and require manually setting the compression ratio for each layer. This process is typically time-consuming and requires expert knowledge to achieve good results. In this paper, we propose NAP, a unified and automatic pruning framework for both fine-grained and structured pruning. It can identify unimportant components of a network and automatically decide appropriate compression ratios for different layers, based on a theoretically sound criterion. Towards this goal, NAP uses an efficient approximation of the Hessian to evaluate the importance of components, based on a Kronecker-factored Approximate Curvature method. Despite being simple to use, NAP outperforms previous pruning methods by large margins. For fine-grained pruning, NAP can compress AlexNet and VGG16 by 25x, and ResNet-50 by 6.7x, without loss in accuracy on ImageNet. For structured pruning (e.g. channel pruning), it can reduce the FLOPs of VGG16 by 5.4x and ResNet-50 by 2.3x with only a 1% accuracy drop. More importantly, this method is almost free from hyper-parameter tuning and requires no expert knowledge. You can start NAP and then take a nap!
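As a rough illustration of the kind of second-order criterion involved (the exact formulation is given in the paper body; the expressions below follow the classical Optimal Brain Surgeon saliency and the standard Kronecker-factored curvature approximation, and are not necessarily NAP's precise formulas), removing a single weight $w_q$ under a local quadratic approximation of the loss incurs a cost of roughly

$$\Delta L_q \;\approx\; \frac{w_q^2}{2\,[H^{-1}]_{qq}}, \qquad H \;\approx\; A \otimes S,$$

where, for each layer, $A = \mathbb{E}[a a^\top]$ is the second moment of the layer's input activations and $S = \mathbb{E}[g g^\top]$ that of the gradients back-propagated to the layer's outputs. The Kronecker structure gives $H^{-1} \approx A^{-1} \otimes S^{-1}$, so per-component saliencies can be evaluated layer by layer without ever forming or inverting the full Hessian; ranking all components globally by such a saliency is what allows per-layer compression ratios to emerge automatically rather than being set by hand.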