Visual recognition is one of the most important and active research areas in computer vision, pattern recognition, and artificial intelligence more broadly. It is of fundamental importance and in strong industrial demand. Deep neural networks (DNNs) have substantially improved performance on many concrete recognition tasks, helped by large amounts of training data and powerful new computational resources. Although recognition accuracy is usually the first concern in new work, efficiency is also important, and sometimes critical, for both academic research and industrial applications. The community also needs an informed view of the opportunities and challenges that efficiency presents. General surveys on the efficiency of DNNs have been written from various perspectives, but as far as we are aware, scarcely any of them focuses systematically on visual recognition, so it remains unclear which advances apply to it and what else needs attention. In this paper, we review recent advances and offer suggestions on possible new directions for improving the efficiency of DNN-related visual recognition approaches. We examine the problem from both the model and the data points of view (the latter is not covered in existing surveys), and focus on the three most studied data types (images, videos, and points). This paper attempts to provide a systematic summary through a comprehensive survey that can serve as a valuable reference and inspire both researchers and practitioners working on visual recognition problems.