Deep learning has achieved impressive results in many areas, but deployment on edge intelligent devices remains slow. To address this problem, we propose a novel compression and acceleration method for deep neural networks based on data distribution characteristics, namely Pruning Filter via Gaussian Distribution Feature (PFGDF). In contrast to previous advanced pruning methods, PFGDF compresses the model by pruning filters whose weight distributions are insignificant, without relying on the contribution or sensitivity information of each convolution filter. PFGDF also differs significantly from weight-sparsification pruning: it requires no specialized acceleration library to process sparse weight matrices and introduces no extra parameters. The pruning process of PFGDF is automated, and the compressed model can recover the same performance as the uncompressed model. We evaluate PFGDF through extensive experiments: on CIFAR-10, PFGDF compresses the convolution filters of VGG-16 by 66.62% and reduces parameters by more than 90%, while inference time is accelerated by 83.73% on a Huawei MATE 10.
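To make the idea concrete, here is a minimal sketch of distribution-based filter selection. The abstract does not specify PFGDF's exact criterion, so this example assumes an illustrative one: fit a Gaussian to each filter's weights and mark filters whose fitted distribution has near-zero spread (i.e., weights clustered tightly around zero) as insignificant. The function names, the `std_threshold` parameter, and the criterion itself are assumptions for illustration, not the paper's actual algorithm.

```python
import numpy as np

def gaussian_filter_stats(conv_weights):
    """Per-filter Gaussian statistics for one conv layer.

    conv_weights: array of shape (out_channels, in_channels, kH, kW).
    Returns (means, stds), one entry per output filter.
    """
    flat = conv_weights.reshape(conv_weights.shape[0], -1)
    return flat.mean(axis=1), flat.std(axis=1)

def select_insignificant_filters(conv_weights, std_threshold):
    """Indices of filters whose weights cluster tightly around zero.

    Illustrative criterion only: a filter whose fitted Gaussian has
    near-zero spread contributes little variation to the layer output.
    The actual PFGDF criterion may differ.
    """
    _, stds = gaussian_filter_stats(conv_weights)
    return np.where(stds < std_threshold)[0]

# Toy layer: 4 filters of shape (3, 3, 3); filter 2 is nearly constant zero.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.5, size=(4, 3, 3, 3))
w[2] = rng.normal(0.0, 1e-4, size=(3, 3, 3))

pruned = select_insignificant_filters(w, std_threshold=0.01)
kept = np.delete(w, pruned, axis=0)  # physically remove pruned filters
```

Because whole filters are removed (rather than zeroing individual weights), the resulting layer is a smaller dense convolution, which is why no sparse-matrix acceleration library is needed at inference time.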