For many years, the family of convolutional neural networks (CNNs) has been a workhorse in deep learning. Recently, many novel CNN structures have been designed to address increasingly challenging tasks. To make them run efficiently on edge devices, researchers have proposed various structured network pruning strategies to reduce their memory and computational cost. However, most of these strategies focus only on reducing the number of filter channels per layer, without considering the redundancy within individual filter channels. In this work, we explore pruning along another dimension: the kernel size. We develop a CNN pruning framework called SMOF, which Squeezes More Out of Filters by reducing both the kernel size and the number of filter channels. Notably, SMOF is friendly to standard hardware devices without any customized low-level implementations, and pruning via kernel size reduction is not limited by the fixed-width constraint of SIMD units in general-purpose processors. The pruned networks can be deployed effortlessly with significant running time reduction. We support these claims via extensive experiments on various CNN structures and general-purpose processors for mobile devices.
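The idea of pruning along the kernel-size dimension can be illustrated with a minimal sketch. The snippet below is a toy illustration, not SMOF's actual pruning criterion: it simply crops a bank of 5x5 filters down to 3x3 by keeping the central weights, showing how kernel size reduction shrinks the parameter count independently of the number of channels. The function name `shrink_kernel` is an assumption for illustration.

```python
import numpy as np

def shrink_kernel(kernel, new_size):
    """Crop a k x k convolution kernel down to new_size x new_size
    by keeping only its central weights (toy illustration, not
    SMOF's learned kernel-size selection)."""
    k = kernel.shape[-1]
    assert new_size <= k and (k - new_size) % 2 == 0
    margin = (k - new_size) // 2
    return kernel[..., margin:k - margin, margin:k - margin]

# A filter bank: 8 output channels, 3 input channels, 5x5 kernels.
filters = np.random.randn(8, 3, 5, 5)

small = shrink_kernel(filters, 3)
print(small.shape)  # (8, 3, 3, 3)

# Weight count drops from 8*3*25 = 600 to 8*3*9 = 216,
# a 64% reduction, with the channel counts untouched.
print(filters.size, small.size)
```

Because the result is still a dense, regularly shaped convolution, it maps directly onto standard convolution kernels on general-purpose hardware, which is the deployment-friendliness the abstract emphasizes.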