Deep Neural Networks (DNNs), as a subset of Machine Learning (ML) techniques, enable learning from real-world data and making decisions in real time. However, their wide adoption is hindered by a number of software and hardware limitations. The existing general-purpose hardware platforms used to accelerate DNNs face new challenges associated with the growing volume of data and the exponentially increasing complexity of computations. Emerging non-volatile memory (NVM) devices and the processing-in-memory (PIM) paradigm are creating a new generation of hardware architectures with increased computing and storage capabilities. In particular, the shift towards ReRAM-based in-memory computing has great potential for implementing area- and power-efficient inference and for training large-scale neural network architectures. This can accelerate the entry of IoT-enabled AI technologies into our daily lives. In this survey, we review state-of-the-art ReRAM-based DNN many-core accelerators and show their superiority over CMOS counterparts. The review covers different aspects of the hardware and software realization of DNN accelerators, their present limitations, and future prospects. In particular, a comparison of the accelerators shows the need for new performance metrics and benchmarking standards. In addition, a major concern for the efficient design of accelerators is the lack of accuracy in simulation tools for software and hardware co-design.