Data redundancy is ubiquitous in the inputs and intermediate results of Deep Neural Networks (DNNs). It offers many significant opportunities for improving DNN performance and efficiency and has been explored in a large body of work. These studies are scattered across many venues and several years. Their targets range from images to videos and text, and the techniques they use to detect and exploit data redundancy vary in many respects. No systematic examination and summary of these efforts exists yet, which makes it difficult for researchers to get a comprehensive view of the prior work, the state of the art, the differences and shared principles among approaches, and the areas and directions yet to be explored. This article tries to fill that void. It surveys hundreds of recent papers on the topic, introduces a novel taxonomy that places the various techniques in a single categorization framework, offers a comprehensive description of the main methods that exploit data redundancy to improve multiple kinds of DNNs, and points out a set of research opportunities for future work.