Deep neural networks (DNNs) have been successfully applied in various fields. A major challenge in deploying DNNs, especially on edge devices, is power consumption, which stems from the large number of multiply-and-accumulate (MAC) operations. To address this challenge, we propose PowerPruning, a novel method for reducing power consumption in digital neural network accelerators by selecting weights that incur less power in MAC operations. In addition, the timing characteristics of the selected weights under all activation transitions are evaluated, and the weight-and-activation combinations that lead to small delays are further selected. Consequently, the maximum delay of the sensitized circuit paths in the MAC units is reduced without modifying the MAC units themselves, which allows flexible scaling of the supply voltage to lower power consumption further. Together with retraining, the proposed method reduces the power consumption of DNNs on hardware by up to 78.3% with only a slight accuracy loss.
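The selection idea above can be illustrated with a minimal sketch: given a (here synthetic) per-value power-cost table for the representable weight values, keep only the lowest-power values and project trained weights onto that allowed set, as a stand-in for the retraining step. The cost table, the 4-bit weight range, and the projection rule are all assumptions for illustration, not the paper's actual procedure.

```python
import numpy as np

# Hypothetical per-value power costs for 4-bit signed weights (assumption:
# in practice these would come from gate-level MAC power simulation;
# here they are synthetic random values for illustration only).
rng = np.random.default_rng(0)
weight_values = np.arange(-8, 8)             # representable 4-bit weight values
power_cost = rng.uniform(0.5, 1.5, size=16)  # relative MAC power per value

def select_low_power_values(values, costs, k):
    """Keep the k weight values with the lowest estimated MAC power."""
    order = np.argsort(costs)
    return np.sort(values[order[:k]])

def project_weights(weights, allowed):
    """Map each trained weight to the nearest allowed (low-power) value."""
    idx = np.abs(weights[..., None] - allowed).argmin(axis=-1)
    return allowed[idx]

allowed = select_low_power_values(weight_values, power_cost, k=8)
w = rng.integers(-8, 8, size=(3, 3))   # stand-in for a trained weight tensor
w_pruned = project_weights(w, allowed)
```

In a real flow, the projection would be interleaved with retraining so the network adapts to the restricted weight set; the timing-driven selection and voltage scaling described in the abstract are separate hardware-level steps not modeled here.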