Deep convolutional neural networks (CNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with limited memory or in applications with strict latency requirements. It is therefore natural to perform model compression and acceleration on deep networks without significantly degrading model performance. Tremendous progress has been made in this area over the past few years. In this paper, we survey recent techniques developed for compacting and accelerating CNN models. These techniques are roughly categorized into four schemes: parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation. Methods of parameter pruning and sharing are described first, and the other techniques are introduced afterwards. For each scheme, we provide insightful analysis of the performance, related applications, advantages, and drawbacks. We then go through a few other recent successful methods, such as dynamic capacity networks and stochastic depth networks. After that, we survey the evaluation metrics, the main datasets used for evaluating model performance, and recent benchmarking efforts. Finally, we conclude this paper and discuss remaining challenges and possible future directions on this topic.
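To make the first of the four schemes concrete, the sketch below illustrates unstructured magnitude-based weight pruning, one representative technique in the parameter pruning and sharing family. It is a minimal NumPy illustration, not an implementation from any surveyed paper; the function name `magnitude_prune` and the 90% sparsity level are chosen here purely for demonstration.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries of a weight tensor.

    A minimal sketch of unstructured magnitude-based pruning:
    `sparsity` is the fraction of weights to remove (e.g. 0.9 keeps 10%).
    Returns the pruned tensor and the boolean mask of kept weights.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to prune
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # Threshold at the k-th smallest absolute value; prune everything at or below it.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# Example: prune 90% of a random 3x3x64x64 convolutional kernel (hypothetical shape).
rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 64, 64)).astype(np.float32)
pruned, mask = magnitude_prune(kernel, sparsity=0.9)
print(f"kept {mask.mean():.1%} of weights")  # roughly 10%
```

In practice, pruning like this is typically interleaved with fine-tuning so the remaining weights can compensate for the removed ones, and the resulting sparse tensors are stored in compressed formats to realize the memory savings.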