Spiking neural networks (SNNs) have attracted wide interest due to their strong biological interpretability and high energy efficiency. With the introduction of the backpropagation algorithm and surrogate gradients, SNN architectures have grown more complex, and the performance gap with artificial neural networks has gradually narrowed. However, most SNN hardware implementations on field-programmable gate arrays (FPGAs) fall short of arithmetic or memory efficiency requirements, which significantly restricts the development of SNNs: they either do not exploit the arithmetic structure of the operations between binary spikes and synaptic weights, or assume unlimited on-chip RAM resources by deploying overly expensive devices on small tasks. To improve arithmetic efficiency, we analyze the neural dynamics of spiking neurons, generalize the SNN arithmetic operation to a multiplex-accumulate operation, and propose a high-performance implementation of this operation by utilizing the DSP48E2 hard block in Xilinx Ultrascale FPGAs. To improve memory efficiency, we design a memory system that enables efficient access to synaptic weights and membrane voltages with reasonable on-chip RAM consumption. Combining these two improvements, we propose an FPGA accelerator that can process spikes generated by the firing neurons on the fly (FireFly). FireFly is implemented on several resource-constrained FPGA edge devices yet sustains a peak performance of 5.53 TSOP/s at 300 MHz. As a lightweight accelerator, FireFly achieves the highest computational density efficiency compared with existing research using large FPGA devices.
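The following is a minimal sketch, not the paper's RTL, of why binary spikes let the usual multiply-accumulate degenerate into a multiplex-accumulate: since each spike is 0 or 1, the product s_i * w_i is just a 2:1 mux that selects either the weight or zero before accumulation, so no hardware multiplier is needed. The function names and the int8 weight range are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mac(spikes: np.ndarray, weights: np.ndarray) -> int:
    """Reference multiply-accumulate: what a generic MAC unit computes."""
    return int(np.sum(spikes * weights))

def mux_accumulate(spikes: np.ndarray, weights: np.ndarray) -> int:
    """Multiplex-accumulate: a 2:1 mux (weight vs. 0) followed by an adder.
    For binary spikes this is arithmetically identical to the MAC above."""
    acc = 0
    for s, w in zip(spikes, weights):
        acc += w if s else 0  # mux select line driven by the 1-bit spike
    return int(acc)

rng = np.random.default_rng(0)
spikes = rng.integers(0, 2, size=16)        # binary spike vector
weights = rng.integers(-128, 128, size=16)  # e.g. int8 synaptic weights
assert mac(spikes, weights) == mux_accumulate(spikes, weights)
```

In hardware, it is this mux-and-add form, rather than a full multiplier, that FireFly maps onto the DSP48E2 hard block; the Python above only illustrates the arithmetic equivalence.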