While machine learning is traditionally a resource-intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches. These approaches aim for a carefully chosen trade-off between predictive performance and resource consumption in terms of computation and energy. The development of such approaches is among the major challenges in current machine learning research and key to ensuring a smooth transition of machine learning technology from a scientific environment with virtually unlimited computing resources into everyday applications. In this article, we provide an overview of the current state of the art of machine learning techniques that facilitate meeting these real-world requirements. In particular, we focus on deep neural networks (DNNs), the predominant machine learning models of the past decade. We give a comprehensive overview of the vast literature, which can be mainly split into three non-mutually exclusive categories: (i) quantized neural networks, (ii) network pruning, and (iii) structural efficiency. These techniques can be applied during training or as post-processing, and they are widely used to improve resource efficiency in terms of memory footprint, inference speed, and energy consumption. We also briefly discuss different concepts of embedded hardware for DNNs, their compatibility with machine learning techniques, and their potential for energy and latency reduction. We substantiate our discussion with experiments on well-known benchmark datasets using compression techniques (quantization, pruning) for a set of resource-constrained embedded systems, such as CPUs, GPUs, and FPGAs. The obtained results highlight the difficulty of finding good trade-offs between resource efficiency and predictive performance.
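To make the two compression families discussed above concrete, the following minimal NumPy sketch (ours, not code from the article) illustrates post-training 8-bit symmetric quantization and unstructured magnitude pruning applied to a single weight matrix; all function names and parameters are illustrative.

```python
# Minimal sketch of two post-training compression steps on a weight
# matrix, using NumPy only. Function names are illustrative, not from
# any specific library.
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization of weights to 8-bit integers.

    Returns the int8 tensor and the scale needed to dequantize:
    w is approximated by scale * q, with q in [-127, 127].
    """
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def prune_by_magnitude(w, sparsity=0.9):
    """Zero out the smallest-magnitude weights (unstructured pruning).

    `sparsity` is the fraction of weights removed; 0.9 keeps only the
    10% of weights with the largest absolute values.
    """
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= threshold, w, 0.0)

# Toy usage: compress a random fully connected layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 128)).astype(np.float32)

q, scale = quantize_int8(w)
w_deq = scale * q.astype(np.float32)           # dequantized approximation
w_sparse = prune_by_magnitude(w, sparsity=0.9)

print("max quantization error:", np.max(np.abs(w - w_deq)))
print("fraction of zeros after pruning:", np.mean(w_sparse == 0.0))
```

The sketch shows only the compression operations themselves; in practice, both quantization and pruning are typically combined with (re-)training or fine-tuning to recover predictive performance, as reflected in the training-time and post-processing variants discussed in the article.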