The rise of neural network (NN) applications has prompted increased interest in compression, with a particular focus on channel pruning, which does not require any additional hardware. Most pruning methods employ either single-layer operations or global schemes to determine which channels to remove, followed by fine-tuning of the network. In this paper we present Gator, a channel-pruning method that temporarily adds learned gating mechanisms for pruning individual channels and is trained with an additional auxiliary loss aimed at reducing computational cost in terms of memory, theoretical speedup (measured in FLOPs), and practical, hardware-specific speedup. Gator introduces a new formulation of dependencies between NN layers which, in contrast to most previous methods, enables pruning of non-sequential parts, such as layers on ResNet's highway, and even the removal of entire ResNet blocks. Pruning ResNet-50 trained on ImageNet, Gator produces state-of-the-art (SOTA) results, such as a 50% reduction in FLOPs with only a 0.4% drop in top-5 accuracy. Gator also outperforms previous pruning models in terms of GPU latency, running 1.4 times faster. Furthermore, Gator achieves higher top-5 accuracy than MobileNetV2 and SqueezeNet at similar runtimes. The source code of this work is available at: https://github.com/EliPassov/gator.
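To make the abstract's central idea concrete, the following is a minimal sketch (not the authors' implementation) of a learned per-channel gate combined with an auxiliary FLOPs-cost term; the names ChannelGate, flops_cost, and the weighting factor are illustrative assumptions.

```python
# Minimal sketch of channel gating with an auxiliary FLOPs cost.
# This is an illustrative assumption of the general technique, not Gator's code.
import torch
import torch.nn as nn


class ChannelGate(nn.Module):
    """Multiplies each output channel by a learned gate in (0, 1)."""

    def __init__(self, num_channels: int):
        super().__init__()
        # One learnable logit per channel; sigmoid keeps gates in (0, 1).
        self.logits = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gates = torch.sigmoid(self.logits)       # shape: (C,)
        return x * gates.view(1, -1, 1, 1)       # broadcast over N, H, W

    def flops_cost(self, flops_per_channel: float) -> torch.Tensor:
        # Auxiliary cost: expected FLOPs contributed by channels still "on".
        return torch.sigmoid(self.logits).sum() * flops_per_channel


# Usage sketch: gate the output of a convolution and add the auxiliary
# cost to the task loss (the 1e-6 weighting is an assumed tuning knob).
conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
gate = ChannelGate(16)
x = torch.randn(2, 3, 32, 32)
out = gate(conv(x))
task_loss = out.pow(2).mean()                    # stand-in for the real training loss
flops_per_channel = 3 * 9 * 32 * 32              # in_channels * kernel_area * H * W
loss = task_loss + 1e-6 * gate.flops_cost(flops_per_channel)
loss.backward()
```

After training, channels whose gates fall below a threshold can be removed and the gates discarded, followed by fine-tuning of the pruned network.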