Recent advances in Artificial Intelligence (AI) at the Internet of Things (IoT)-enabled network edge have realized edge intelligence in several applications, such as smart agriculture, smart hospitals, and smart factories, by enabling low latency and computational efficiency. However, deploying state-of-the-art Convolutional Neural Networks (CNNs) such as VGG-16 and ResNets on resource-constrained edge devices is practically infeasible due to their large numbers of parameters and floating-point operations (FLOPs). Thus, network pruning, a form of model compression, is gaining attention as a way to accelerate CNNs on low-power devices. State-of-the-art pruning approaches, whether structured or unstructured, do not account for the differing complexities exhibited by individual convolutional layers and follow a training-pruning-retraining pipeline, which incurs additional computational overhead. In this work, we propose a novel and computationally efficient pruning pipeline that exploits the inherent layer-level complexities of CNNs. Unlike typical methods, our complexity-driven algorithm selects a layer for filter pruning based on its contribution to overall network complexity. We follow a procedure that trains the pruned model directly, avoiding the computationally expensive ranking and fine-tuning steps. Moreover, we define three modes of pruning, namely parameter-aware (PA), FLOPs-aware (FA), and memory-aware (MA), to enable versatile compression of CNNs. Our results show that our approach performs competitively in terms of accuracy and acceleration. Lastly, we present the trade-off between different resources and accuracy, which can help developers make informed decisions in resource-constrained IoT environments.
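The complexity-driven selection idea can be illustrated with a minimal sketch. The layer configurations, the per-layer cost formulas, and the selection rule below are illustrative assumptions, not the paper's exact method: each convolutional layer's contribution to total network complexity is computed under the chosen mode (parameters for PA, FLOPs for FA), and the layer with the largest share is selected for filter pruning.

```python
# Minimal sketch of complexity-driven layer selection (assumed configs and
# cost formulas, for illustration only).
# Hypothetical layer description: (name, in_channels, out_channels,
# kernel_size, output_height, output_width).
layers = [
    ("conv1", 3, 64, 3, 32, 32),
    ("conv2", 64, 128, 3, 16, 16),
    ("conv3", 128, 256, 3, 8, 8),
]

def params(c_in, c_out, k):
    # Weights plus one bias per output channel.
    return c_out * (c_in * k * k + 1)

def flops(c_in, c_out, k, h, w):
    # Two operations (multiply + add) per weight, at every output position.
    return 2 * params(c_in, c_out, k) * h * w

def select_layer(layers, mode="PA"):
    """Return (name, share): the layer contributing most to total complexity."""
    cost = {
        "PA": lambda l: params(l[1], l[2], l[3]),           # parameter-aware
        "FA": lambda l: flops(l[1], l[2], l[3], l[4], l[5]), # FLOPs-aware
    }[mode]
    total = sum(cost(l) for l in layers)
    return max(((l[0], cost(l) / total) for l in layers), key=lambda t: t[1])

print(select_layer(layers, "PA"))
print(select_layer(layers, "FA"))
```

Note that the two modes can disagree: a late layer with many channels dominates the parameter count, while an earlier layer with a larger feature map may dominate the FLOPs, which is why the mode choice matters in resource-constrained deployments.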