Recent efforts to improve the performance of neural network (NN) accelerators that meet today's application requirements have given rise to a new trend of logic-based NN inference relying on fixed-function combinational logic. Mapping such large Boolean functions, with many input variables and product terms, to digital signal processors (DSPs) on field-programmable gate arrays (FPGAs) requires a novel framework that accounts for the structure and reconfigurability of DSP blocks during the mapping process. The methodology proposed in this paper maps fixed-function combinational logic blocks to a set of Boolean functions, where the Boolean operations corresponding to each function are mapped to DSP blocks rather than to look-up tables (LUTs) on the FPGA, in order to take advantage of the high performance, low latency, and parallelism of DSP blocks.
%
This paper also presents an innovative design and optimization methodology for the compilation and mapping of NNs that utilize fixed-function combinational logic to DSPs on FPGAs, employing a high-level synthesis flow.
%
Our experimental evaluations across several \REVone{datasets} and selected NNs demonstrate that our framework achieves comparable performance, in terms of inference latency and output accuracy, to prior-art FPGA-based NN accelerators employing DSPs.