Compiler frameworks are crucial for the widespread use of FPGA-based deep learning accelerators. They allow researchers and developers who are not familiar with hardware engineering to harness the performance attained by domain-specific logic. A variety of frameworks exist for conventional artificial neural networks, but little research effort has gone into frameworks optimized for spiking neural networks (SNNs). This new generation of neural networks is becoming increasingly attractive for deploying AI on edge devices, which have tight power and resource constraints. Our end-to-end framework E3NE automates the generation of efficient SNN inference logic for FPGAs. Starting from a PyTorch model and user parameters, it applies various optimizations and assesses the trade-offs inherent to spike-based accelerators. Multiple levels of parallelism and the use of an emerging neural encoding scheme yield an efficiency superior to that of previous SNN hardware implementations. For a comparable model, E3NE uses less than 50% of the hardware resources and 20% less power, while reducing latency by an order of magnitude. Furthermore, its scalability and generality allowed the deployment of the large-scale SNN models AlexNet and VGG.