As deep learning models typically contain millions of trainable weights, there is growing demand for more efficient network structures with reduced storage requirements and improved run-time efficiency. Pruning is one of the most popular network compression techniques. In this paper, we propose a novel unstructured pruning pipeline, Attention-based Simultaneous sparse structure and Weight Learning (ASWL). Unlike traditional channel-wise or weight-wise attention mechanisms, ASWL introduces an efficient algorithm that calculates the pruning ratio of each layer through layer-wise attention, and it tracks the weights of both the dense and the sparse network so that the pruned structure is learned simultaneously from randomly initialized weights. Our experiments on MNIST, CIFAR-10, and ImageNet show that ASWL achieves superior pruning results in terms of accuracy, pruning ratio, and operating efficiency when compared with state-of-the-art network pruning methods.
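To make the idea concrete, the sketch below illustrates one plausible reading of the layer-wise mechanism described above: a layer keeps a dense weight tensor, a learnable per-layer attention logit is mapped to a pruning ratio, and a magnitude-based mask derived from that ratio produces the sparse weight used in the forward pass while gradients continue to update the dense weights. This is a minimal, hypothetical illustration, not the authors' implementation; the class `ASWLLinear`, the parameters `attn_logit` and `max_ratio`, and the magnitude-threshold masking rule are all assumptions introduced here for exposition.

```python
# Hypothetical sketch of attention-driven layer-wise pruning (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ASWLLinear(nn.Module):
    """Linear layer keeping dense weights plus an attention-derived sparse mask."""

    def __init__(self, in_features: int, out_features: int, max_ratio: float = 0.9):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)
        # One learnable attention logit per layer; sigmoid maps it to a pruning ratio.
        self.attn_logit = nn.Parameter(torch.zeros(1))
        self.max_ratio = max_ratio

    def pruning_ratio(self) -> torch.Tensor:
        # Fraction of this layer's weights to prune, in [0, max_ratio].
        return self.max_ratio * torch.sigmoid(self.attn_logit)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ratio = self.pruning_ratio()
        k = int(ratio.item() * self.weight.numel())
        mask = torch.ones_like(self.weight)
        if k > 0:
            # Zero out the k smallest-magnitude weights of this layer.
            threshold = self.weight.abs().flatten().kthvalue(k).values
            mask = (self.weight.abs() > threshold).float()
        # The forward pass uses the sparse weight; gradients still reach the
        # dense weight at the unmasked positions, so dense and sparse
        # parameters are tracked together during training.
        sparse_weight = self.weight * mask
        return F.linear(x, sparse_weight, self.bias)
```

In this simplified form the mask construction is non-differentiable, so a full training loop would also need a term that depends on `pruning_ratio()` (for example, a sparsity regularizer added to the task loss) for the attention logit to receive gradients; how ASWL actually trains the layer-wise attention is specified in the paper itself, and the regularizer mentioned here is only an assumed stand-in.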