Energy efficiency is a key concern for neural network applications. Hardware acceleration using FPGAs or GPUs can provide better energy efficiency than general-purpose processors; however, further improving the energy efficiency of such accelerators would be especially beneficial for deploying neural networks in power-constrained edge computing environments. In this paper, we experimentally explore the potential of device-level energy-efficiency techniques (e.g., supply voltage underscaling, frequency scaling, and data quantization) on representative off-the-shelf FPGAs compared to GPUs. Frequency scaling on both platforms can improve power and energy consumption, but at the cost of performance; on GPUs, for example, it improves power consumption and GOPs/J by up to 34% and 28%, respectively. Leveraging reduced-precision instructions, however, improves power (by up to 13%), energy (by up to 20%), and performance (by up to 7%) simultaneously, with negligible loss in neural network accuracy.