Deep neural networks have been applied in many applications and exhibit extraordinary abilities in the field of computer vision. However, complex network architectures hinder efficient real-time deployment and demand significant computational resources and energy. These challenges can be overcome through optimizations such as network compression. This paper surveys two types of network compression: pruning and quantization. We compare current techniques, analyze their strengths and weaknesses, provide guidance for compressing networks, and discuss possible future compression techniques.