Accelerating neural network inference with FPGAs has emerged as a popular option, since the reconfigurability and high-performance computing capability of FPGAs intrinsically satisfy the computational demands of fast-evolving neural algorithms. However, popular neural accelerators on FPGAs (e.g., the Xilinx DPU) mainly utilize DSP resources to construct their processing units, while the abundant LUT resources are not well exploited. In this work, via a software-hardware co-design approach, we develop an FPGA-based heterogeneous computing system for neural network acceleration. From the hardware perspective, the proposed accelerator consists of DSP- and LUT-based GEneral Matrix-Multiplication (GEMM) computing cores, which form the entire computing system in a heterogeneous fashion. The DSP- and LUT-based GEMM cores operate under a unified Instruction Set Architecture (ISA) with unified buffers. Along the data flow of the neural network inference path, the computation of each convolution/fully-connected layer is split into two portions, handled by the DSP- and LUT-based GEMM cores asynchronously. From the software perspective, we mathematically and systematically model the latency and resource utilization of the proposed heterogeneous accelerator with respect to varying system design configurations. By leveraging reinforcement learning, we construct a framework that achieves end-to-end selection and optimization of the design specification of the target heterogeneous accelerator, including the workload-split strategy, the mixed-precision quantization scheme, and the resource allocation between the DSP and LUT cores. By virtue of the proposed design framework and heterogeneous computing system, our design outperforms the state-of-the-art Mix&Match design, reducing latency by 1.12-1.32x while achieving higher inference accuracy. N3H-Core is open-sourced at: https://github.com/elliothe/N3H_Core.
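To make the asynchronous workload split concrete, the following is a minimal Python sketch (not the authors' code) of how a layer's GEMM workload might be divided between a DSP-based and a LUT-based core so that both finish at roughly the same time. The function name `balanced_split` and the per-cycle throughput figures are illustrative assumptions, not values from the paper or the N3H_Core repository.

```python
def balanced_split(total_macs, dsp_macs_per_cycle, lut_macs_per_cycle):
    """Return the fraction of MACs assigned to the DSP core such that
    t_dsp ~= t_lut when both cores run asynchronously in parallel.

    With t_dsp = r * total / dsp_rate and t_lut = (1 - r) * total / lut_rate,
    setting t_dsp == t_lut gives r = dsp_rate / (dsp_rate + lut_rate).
    """
    r = dsp_macs_per_cycle / (dsp_macs_per_cycle + lut_macs_per_cycle)
    t_dsp = r * total_macs / dsp_macs_per_cycle
    t_lut = (1 - r) * total_macs / lut_macs_per_cycle
    # Overall layer latency is bounded by the slower of the two cores.
    return r, max(t_dsp, t_lut)


if __name__ == "__main__":
    # Example: a convolution layer flattened to GEMM (~115M MACs), with
    # assumed per-cycle MAC throughputs for the DSP and LUT cores.
    ratio, cycles = balanced_split(total_macs=115_605_504,
                                   dsp_macs_per_cycle=1024,
                                   lut_macs_per_cycle=512)
    print(f"DSP share: {ratio:.2%}, latency: {cycles:,.0f} cycles")
```

In the actual framework described above, this split ratio is not fixed analytically per layer but is selected per-layer by the reinforcement-learning agent jointly with the quantization bit-widths and the DSP/LUT resource allocation.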