Convolutional neural networks (CNNs) have shown state-of-the-art performance in various applications. However, CNNs are resource-hungry because of their high computational complexity and memory requirements. Recent efforts to improve the computational efficiency of CNNs include filter pruning methods, which eliminate some of the filters in a CNN based on the "importance" of the filters. Most existing filter pruning methods are either "active", using a dataset and generating feature maps to quantify filter importance, or "passive", computing filter importance from the entry-wise norm of the filters without involving data. Under a high pruning ratio, where a large number of filters are to be pruned from the network, entry-wise norm methods eliminate the filters with relatively smaller norms without considering the significance of the filters in producing the node output, which degrades performance. To address this, we present a passive filter pruning method in which filters are pruned based on their contribution to producing the output, measured by the operator norm of the filters. The proposed method generalizes better across various CNNs than entry-wise norm-based pruning methods. Compared with existing active filter pruning methods, the proposed method is at least 4.5 times faster in computing filter importance while achieving similar performance. The efficacy of the proposed method is evaluated on audio scene classification and image classification using various CNN architectures such as VGGish, DCASE21_Net, VGG-16 and ResNet-50.
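The difference between an entry-wise norm and an operator norm can be illustrated with a small sketch. The abstract does not say how a filter is arranged as a linear operator, so the example below makes two assumptions that are not taken from the source: each filter is flattened into a matrix (for example, input channels by kernel elements), and the operator norm is the spectral norm, that is, the largest singular value, estimated here by power iteration. The entry-wise norm is assumed to be the l2 (Frobenius) norm. The function names and the two toy filters are hypothetical, and this is not the authors' implementation.

```python
import math
import random


def frobenius_norm(matrix):
    """Entry-wise l2 norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def spectral_norm(matrix, iterations=200, seed=0):
    """Operator (spectral) norm, estimated by power iteration on A^T A.

    Assumption: the filter has already been flattened into a 2-D matrix.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    rng = random.Random(seed)
    # Random start vector, normalised to unit length.
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    norm_v = math.sqrt(sum(x * x for x in v))
    v = [x / norm_v for x in v]
    sigma = 0.0
    for _ in range(iterations):
        # u = A v
        u = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        # w = A^T u = (A^T A) v
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0  # the matrix maps v to zero (e.g. an all-zero filter)
        v = [x / norm_w for x in w]
        # For unit v, ||A^T A v|| approaches the largest singular value squared.
        sigma = math.sqrt(norm_w)
    return sigma


def rank_filters(filters, norm_fn):
    """Return filter indices ordered from least to most important."""
    scores = [norm_fn(f) for f in filters]
    return sorted(range(len(filters)), key=lambda i: scores[i])


# Two hypothetical flattened filters.
filter_a = [[2.0, 0.0], [0.0, 0.0]]   # energy concentrated in one direction
filter_b = [[1.5, 0.0], [0.0, 1.5]]   # energy spread over two directions
filters = [filter_a, filter_b]

print(rank_filters(filters, frobenius_norm))  # [0, 1]: filter_a is pruned first
print(rank_filters(filters, spectral_norm))   # [1, 0]: filter_b is pruned first
```

In this toy case the two criteria disagree: `filter_b` has the larger entry-wise norm (about 2.12 versus 2.0), while `filter_a` has the larger spectral norm (2.0 versus 1.5), meaning it can scale some input more strongly. Whether this matches the ranking used in the proposed method depends on details the abstract does not give.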