Implementing embedded neural network processing at the edge requires efficient hardware acceleration that couples high computational performance with low power consumption. Driven by the rapid evolution of network architectures and their algorithmic features, accelerator designs are constantly updated and improved. To evaluate and compare hardware design choices, designers can refer to a myriad of accelerator implementations in the literature. Surveys provide an overview of these works but are often limited to system-level and benchmark-specific performance metrics, making it difficult to quantitatively compare the individual effect of each optimization technique. This complicates the evaluation of optimizations for new accelerator designs, slowing down research progress. This work surveys the neural network accelerator optimization approaches used in recent works and reports their individual effects on edge processing performance. It presents the optimizations and their quantitative effects as a construction kit, allowing designers to assess the design choices for each building block separately. Reported optimizations range from up to 10,000x memory savings to 33x energy reductions, providing chip designers an overview of design choices for implementing efficient low-power neural network accelerators.