Channel pruning is a popular technique for compressing convolutional neural networks (CNNs), where various pruning criteria have been proposed to remove redundant filters. From comprehensive experiments, we found two blind spots in the study of pruning criteria: (1) Similarity: There are strong similarities among several primary pruning criteria that are widely cited and compared. Under these criteria, the ranks of the filters' Importance Scores are almost identical, resulting in similar pruned structures. (2) Applicability: The filters' Importance Scores measured by some pruning criteria are too close together to distinguish the network redundancy well. In this paper, we analyze these two blind spots for different types of pruning criteria under layer-wise pruning and global pruning. The analyses are based on empirical experiments and our assumption (the Convolutional Weight Distribution Assumption) that the well-trained convolutional filters in each layer approximately follow a Gaussian-like distribution. This assumption has been verified through systematic and extensive statistical tests.
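To make the Similarity blind spot concrete, the following is a minimal sketch (ours, not the paper's code) that scores each filter of one convolutional layer under two common pruning criteria, the L1 norm and the L2 norm, and compares the resulting importance rankings with a Spearman rank correlation; the layer shape and the choice of criteria are illustrative assumptions.

```python
# Sketch of the "Similarity" blind spot: two pruning criteria can produce
# nearly identical filter rankings, hence nearly identical pruned structures.
import torch
import torch.nn as nn
from scipy.stats import spearmanr

# Hypothetical layer for illustration; in practice, take a layer from a
# well-trained network.
conv = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)

with torch.no_grad():
    w = conv.weight                            # (out_channels, in, k, k)
    flat = w.view(w.size(0), -1)               # one row per filter
    l1_scores = flat.abs().sum(dim=1)          # L1-norm criterion
    l2_scores = flat.pow(2).sum(dim=1).sqrt()  # L2-norm criterion

rho, _ = spearmanr(l1_scores.numpy(), l2_scores.numpy())
print(f"Spearman rank correlation between L1 and L2 rankings: {rho:.3f}")
```

A correlation close to 1 means the two criteria would select nearly the same filters to prune, which is the similarity the abstract describes.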
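As a rough illustration of the kind of statistical check behind the Convolutional Weight Distribution Assumption, the sketch below tests whether the flattened weights of each convolutional layer in a pretrained CNN are consistent with a Gaussian, using the D'Agostino-Pearson normality test from SciPy; the model choice and the specific test are our assumptions, and the paper's actual test suite may differ.

```python
# Sketch: per-layer normality check on the weights of a well-trained CNN.
import torch.nn as nn
from scipy.stats import normaltest
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1")  # any well-trained CNN works

for name, module in model.named_modules():
    if isinstance(module, nn.Conv2d):
        w = module.weight.detach().flatten().numpy()
        stat, p = normaltest(w)  # D'Agostino-Pearson test
        # With very large samples the test is extremely sensitive, so the
        # claim is approximate Gaussianity, not an exact fit.
        print(f"{name}: n={w.size}, p-value={p:.3g}")
```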