Title: End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs

Javier Campos, Zhen Dong, Javier Duarte, Amir Gholami, Michael W. Mahoney, Jovan Mitrevski, Nhan Tran

From arXiv: 19 pages, 6 figures, 2 tables

We develop an end-to-end workflow for the training and implementation of co-designed neural networks (NNs) for efficient field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) hardware. Our approach leverages Hessian-aware quantization (HAWQ) of NNs, the Quantized Open Neural Network Exchange (QONNX) intermediate representation, and the hls4ml tool flow for transpiling NNs into FPGA and ASIC firmware. This makes efficient NN implementations in hardware accessible to nonexperts, in a single open-sourced workflow that can be deployed for real-time machine learning applications in a wide range of scientific and industrial settings. We demonstrate the workflow in a particle physics application involving trigger decisions that must operate at the 40 MHz collision rate of the CERN Large Hadron Collider (LHC). Given the high collision rate, all data processing must be implemented on custom ASIC and FPGA hardware within a strict area and latency. Based on these constraints, we implement an optimized mixed-precision NN classifier for high-momentum particle jets in simulated LHC proton-proton collisions.
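To make the idea of a mixed-precision network concrete, the sketch below applies uniform symmetric quantization to the same weights at two different bit widths and compares the resulting error. It is only an illustration under stated assumptions: the function names, layer names, weight values, and bit widths are hypothetical, and the bit widths are picked by hand. HAWQ selects per-layer precision using Hessian information, which this sketch does not attempt, and it does not use the QONNX or hls4ml APIs.

```python
def quantize_symmetric(values, bits):
    """Map floats onto a signed, symmetric `bits`-bit integer grid and back."""
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed symmetric grid")
    qmax = 2 ** (bits - 1) - 1              # largest integer level, e.g. 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level, then clip to the representable range.
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    dequantized = [q * scale for q in ints]
    return ints, scale, dequantized


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical layers: identical weights, different hand-picked bit widths.
layers = {
    "dense_1": ([0.62, -0.31, 0.05, -0.88], 8),
    "dense_2": ([0.62, -0.31, 0.05, -0.88], 4),
}

errors = {}
for name, (weights, bits) in layers.items():
    ints, scale, deq = quantize_symmetric(weights, bits)
    errors[name] = mean_squared_error(weights, deq)
    print(f"{name}: {bits} bits, integer levels {ints}, MSE {errors[name]:.2e}")
```

The lower bit width needs fewer hardware resources per weight but introduces a larger quantization error, which is the trade-off a mixed-precision design has to balance against area and latency constraints.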
