HADES: Hardware/Algorithm Co-design in DNN accelerators using Energy-efficient Approximate Alphabet Set Multipliers

Edge computing must be capable of executing computationally intensive algorithms, such as Deep Neural Networks (DNNs), while operating within a constrained computational resource budget. Such computations involve Matrix Vector Multiplications (MVMs), which are the dominant contributor to the memory and energy budget of DNNs. To alleviate the computational intensity and storage demand of MVMs, we propose circuit-algorithm co-design techniques with low-complexity approximate Multiply-Accumulate (MAC) units derived from the principles of Alphabet Set Multipliers (ASMs). Selecting a small, well-chosen set of alphabets from ASMs leads to a multiplier-less DNN implementation and enables encoding of low-precision weights and input activations into fewer bits. To maintain accuracy under alphabet set approximations, we developed a novel ASM-alphabet-aware training method. The proposed low-complexity, multiplication-aware algorithm was implemented In-Memory and Near-Memory with efficient shift operations to further reduce the data-movement cost between memory and the processing unit. We benchmark our design on the CIFAR10 and ImageNet datasets for ResNet and MobileNet models and attain an accuracy degradation of less than 1-2% against full precision, with energy benefits of more than 50% compared to a standard von Neumann counterpart.
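To make the alphabet-set idea concrete, the following is a minimal Python sketch of a generic alphabet set multiplication, not the paper's actual MAC design or training scheme. It assumes 8-bit unsigned weights split into two 4-bit quartets, and it rounds each quartet to the nearest value expressible as `alphabet << shift` (ties go to the smaller value). The function names and the rounding rule are hypothetical choices made for illustration. The pre-computed bank uses Python's `*` for brevity; in hardware those few multiples would come from adders.

```python
QUARTET_BITS = 4  # assumption: weights are processed 4 bits at a time


def supported_values(alphabets):
    """Quartet values reachable as alphabet << shift, plus zero."""
    limit = 1 << QUARTET_BITS
    values = {0: None}  # value -> (alphabet, shift)
    for a in alphabets:
        shift = 0
        while (a << shift) < limit:
            values.setdefault(a << shift, (a, shift))
            shift += 1
    return values


def quantize_quartet(q, table):
    """Round a quartet to the nearest supported value (ties to the smaller)."""
    return min(table, key=lambda v: (abs(v - q), v))


def asm_multiply(weight, activation, alphabets=(1,)):
    """Approximate weight * activation using select, shift, and add only."""
    table = supported_values(alphabets)
    # Pre-computed bank of alphabet multiples of the activation.
    bank = {a: a * activation for a in alphabets}
    total = 0
    for pos in range(0, 8, QUARTET_BITS):
        quartet = (weight >> pos) & 0xF
        approx = quantize_quartet(quartet, table)
        if approx == 0:
            continue
        alphabet, shift = table[approx]
        total += bank[alphabet] << (shift + pos)
    return total


if __name__ == "__main__":
    full = (1, 3, 5, 7, 9, 11, 13, 15)
    for alphabets in ((1,), (1, 3), full):
        print(alphabets, asm_multiply(55, 5, alphabets), "exact:", 55 * 5)
```

With the single alphabet `{1}`, every product reduces to shifts and adds of the activation itself, which is the multiplier-less case. The approximation error this introduces for weights such as 55 is what an alphabet-aware training procedure would need to compensate for.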

