We propose an efficient and straightforward method for compressing deep convolutional neural networks (CNNs) that uses basis filters to represent the convolutional layers, and optimizes the performance of the compressed network directly in the basis space. Specifically, any spatial convolution layer of the CNN can be replaced by two successive convolution layers: the first is a set of three-dimensional orthonormal basis filters, followed by a layer of one-dimensional filters that represents the original spatial filters in the basis space. We jointly fine-tune both the basis and the filter representation to directly mitigate any performance loss due to the truncation. The generality of the proposed approach is demonstrated by applying it to several well-known deep CNN architectures and data sets for image classification and object detection. We also report the execution time and power usage at different compression levels on the Xavier Jetson AGX processor.
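As a rough illustration of the decomposition described above, the following numpy sketch factors a layer's spatial filters into a truncated orthonormal basis plus one-dimensional (1x1) coefficient filters. All names, shapes, and the use of SVD for the truncation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical sketch: approximate a conv layer's filters W with
# r orthonormal basis filters and 1x1 coefficient filters.
rng = np.random.default_rng(0)
C_out, C_in, k = 8, 4, 3
W = rng.standard_normal((C_out, C_in, k, k))  # original spatial filters

# Flatten each output filter and obtain an orthonormal basis via SVD
# (one plausible way to initialize the basis before joint fine-tuning).
W_mat = W.reshape(C_out, C_in * k * k)
U, S, Vt = np.linalg.svd(W_mat, full_matrices=False)

r = 4                          # number of basis filters kept (compression level)
basis = Vt[:r]                 # r orthonormal 3-D basis filters, flattened
coeffs = W_mat @ basis.T       # 1x1 filters: coordinates in the basis space

# The original layer is then approximated by a conv with `basis`
# followed by a 1x1 conv with `coeffs`.
W_approx = (coeffs @ basis).reshape(W.shape)
rel_err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
```

In the paper's method both `basis` and `coeffs` would subsequently be fine-tuned jointly on the task loss to recover accuracy lost to the truncation; the SVD here only provides a starting point.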