A high-accuracy CNN is often accompanied by a huge number of parameters, which are usually stored in high-dimensional tensors. However, few methods can identify the redundant information in the parameters stored in these high-dimensional tensors, so the compression of CNNs lacks theoretical guidance. In this paper, we propose a novel theory for finding redundant information in three-dimensional tensors, namely Quantified Similarity of Feature Maps (QSFM), and use it to prune convolutional neural networks to enhance inference speed. Our method belongs to filter pruning and can be implemented without any special libraries. We apply our method not only to common convolution layers but also to special convolution layers, such as depthwise separable convolution layers. The experiments show that QSFM can find the redundant information in a neural network effectively. Without any fine-tuning, QSFM compresses ResNet-56 on CIFAR-10 significantly (48.27% FLOPs and 57.90% parameters reduction) with only a 0.54% loss in top-1 accuracy. QSFM also prunes ResNet-56, VGG-16, and MobileNetV2 with fine-tuning, again with excellent results.
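To make the idea of similarity-based filter pruning concrete, the following is a minimal sketch, not the paper's actual algorithm: it assumes a hypothetical layer `conv` and input `batch`, and uses pairwise Euclidean distance between feature maps as a placeholder for the paper's quantified similarity measure. Two filters whose feature maps are nearly identical carry redundant information, so one of them is a pruning candidate.

```python
# Minimal sketch of similarity-based filter pruning in the spirit of QSFM.
# The paper's exact similarity metric and selection rule are not reproduced
# here; pairwise Euclidean distance is a hypothetical stand-in, and `conv`
# and `batch` are illustrative names.
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
batch = torch.randn(16, 3, 32, 32)     # a small batch of input images

with torch.no_grad():
    fmaps = conv(batch)                # (N, C, H, W) feature maps

# Average over the batch so each filter is summarized by one (H, W) map.
mean_maps = fmaps.mean(dim=0)          # (C, H, W)
C = mean_maps.shape[0]
flat = mean_maps.reshape(C, -1)        # one row per filter

# Pairwise distances between feature maps; small distance = high similarity.
dist = torch.cdist(flat, flat)         # (C, C)
dist.fill_diagonal_(float("inf"))      # ignore self-similarity

# The filter whose feature map is closest to another one is the most
# redundant candidate: its information is (approximately) duplicated.
i, j = divmod(dist.argmin().item(), C)
print(f"filters {i} and {j} produce the most similar feature maps; "
      f"one of them is a pruning candidate")
```

In a full pruning pipeline this selection would be repeated per layer, the chosen filters removed, and the network optionally fine-tuned, matching the with- and without-fine-tuning settings reported above.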