Deep neural networks (DNNs) are widely applied in artificial intelligence applications, achieving promising performance at the cost of massive computation, large power consumption, and high latency. Diverse solutions have been proposed to cope with the challenges of latency and power consumption, including lightweight neural networks and efficient hardware accelerators. Moreover, research on quantization reduces the computational cost and demonstrates the error resiliency of DNNs. To improve the latency and power efficiency of hardware accelerators by exploiting this error resiliency, we propose an application-specific optimization method for the automatic design of approximate multipliers for DNNs. The proposed method optimizes an approximate multiplier by minimizing its error according to the probability distributions extracted from DNNs. By applying the optimized approximate multiplier to a DNN, we obtain 1.60%, 15.32%, and 20.19% higher accuracy than the best reproduced approximate multiplier on the widely used MNIST, FashionMNIST, and CIFAR-10 datasets, respectively, with 12.17% smaller area, 23.38% less power consumption, and 16.53% lower latency. Compared with an exact multiplier, the optimized multiplier reduces the area, power consumption, and latency by 36.88%, 52.45%, and 26.63%, respectively. Applied to FPGA-based and ASIC-based DNN accelerator modules, our approximate multiplier achieves low LUT utilization and small area, respectively, with competitive maximum frequency and power consumption, which demonstrates the effectiveness of the proposed method in reducing the hardware cost of DNN accelerators.
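To make the probability-weighted error objective mentioned above concrete, the sketch below scores a candidate approximate multiplier against operand-value distributions assumed to be extracted from a quantized DNN. This is a minimal illustration, not the paper's actual optimization flow: the names `expected_error` and `truncating_mul`, the 8-bit operand width, and the uniform placeholder distributions are hypothetical and only show the form of the objective being minimized.

```python
import numpy as np

def expected_error(approx_mul, weight_probs, act_probs, bits=8):
    """Probability-weighted mean error distance of an approximate multiplier.

    weight_probs / act_probs are assumed to be histograms of operand values
    (e.g., quantized weights and activations) collected from the target DNN.
    """
    values = np.arange(2 ** bits)
    exact = np.outer(values, values)                        # exact products
    approx = np.array([[approx_mul(a, b) for b in values]   # approximate products
                       for a in values])
    joint = np.outer(weight_probs, act_probs)               # joint operand probability
    return np.sum(joint * np.abs(approx - exact))           # expected error distance

# Toy candidate design: truncate the two least-significant bits of one operand.
truncating_mul = lambda a, b: (a & ~0b11) * b

# Uniform distributions as placeholders; in the application-specific setting
# these would come from the weights and activations of the deployed DNN.
uniform = np.full(256, 1 / 256)
print(expected_error(truncating_mul, uniform, uniform))
```

A search over candidate multiplier configurations would then keep the design with the smallest such expected error for the given distributions, trading hardware cost against this application-specific error metric.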