GraphAGILE: An FPGA-based Overlay Accelerator for Low-latency GNN Inference - 专知 Papers

Reducible · Inference · GNN · Compiler · FPGA

February 2, 2023

GraphAGILE: An FPGA-based Overlay Accelerator for Low-latency GNN Inference

Bingyi Zhang, Hanqing Zeng, Viktor Prasanna

From arXiv, 16 pages

This paper presents GraphAGILE, a domain-specific FPGA-based overlay accelerator for graph neural network (GNN) inference. GraphAGILE consists of (1) a novel unified architecture design with an instruction set, and (2) a compiler built upon the instruction set that can quickly generate optimized code. Thanks to the proposed instruction set architecture (ISA) and the compiler, GraphAGILE does not require any FPGA reconfiguration when performing inference on different GNN models and input graphs. For the architecture design, we propose a novel hardware module named Adaptive Computation Kernel (ACK) that can execute the various computation kernels of GNNs, including general matrix multiplication (GEMM), sparse-dense matrix multiplication (SpDMM), and sampled dense-dense matrix multiplication (SDDMM). The compiler takes the specification of a GNN model and the graph metadata (e.g., the number of vertices and edges) as input and generates a sequence of instructions for inference execution. We develop the following compiler optimizations to reduce inference latency: (1) computation order optimization, which automatically reorders the computation graph to reduce the total computation complexity; (2) layer fusion, which merges adjacent layers to reduce data communication volume; (3) data partitioning with a partition-centric execution scheme, which partitions the input graph to fit the available on-chip memory of the FPGA; and (4) kernel mapping, which automatically selects the execution mode for the ACK and performs task scheduling to overlap computation with data communication and achieve dynamic load balance. We implement GraphAGILE on a state-of-the-art FPGA platform, the Xilinx Alveo U250. GraphAGILE can execute widely used GNN models, including GCN, GAT, GIN, GraphSAGE, SGC, and other GNN models supported by GraphGym.
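
To make the three kernel types named in the abstract concrete, here is a minimal pure-Python sketch of GEMM, SpDMM, and SDDMM, plus a rough operation count showing why the order of aggregation and transformation in a product like A·X·W can matter. This is only an illustration of the mathematical operations: the dictionary-based sparse layout, the function names (`gemm`, `spdmm`, `sddmm`, `layer_cost`), and the cost model are assumptions made for this example and are not taken from GraphAGILE's ISA, hardware, or compiler.

```python
from typing import Dict, List, Tuple

Dense = List[List[float]]
# Sparse matrix stored as {(row, col): value}; a hypothetical layout chosen for brevity.
Sparse = Dict[Tuple[int, int], float]


def gemm(x: Dense, w: Dense) -> Dense:
    """Dense-dense product X @ W (e.g. feature transformation)."""
    inner, cols = len(w), len(w[0])
    return [[sum(row[k] * w[k][j] for k in range(inner)) for j in range(cols)]
            for row in x]


def spdmm(a: Sparse, h: Dense, num_rows: int) -> Dense:
    """Sparse-dense product A @ H (e.g. feature aggregation over edges)."""
    out = [[0.0] * len(h[0]) for _ in range(num_rows)]
    for (i, j), val in a.items():
        for f, h_jf in enumerate(h[j]):
            out[i][f] += val * h_jf
    return out


def sddmm(a: Sparse, p: Dense, q: Dense) -> Sparse:
    """Sampled product: (P @ Q^T) scaled by A, computed only at A's nonzeros."""
    return {(i, j): val * sum(pi * qj for pi, qj in zip(p[i], q[j]))
            for (i, j), val in a.items()}


def layer_cost(num_vertices: int, num_edges: int,
               f_in: int, f_out: int) -> Dict[str, int]:
    """Assumed multiply-accumulate counts for the two orderings of A @ X @ W."""
    transform = num_vertices * f_in * f_out
    return {
        "aggregate_first": num_edges * f_in + transform,   # (A @ X) @ W
        "transform_first": transform + num_edges * f_out,  # A @ (X @ W)
    }


# Toy input: a 3-vertex path graph (0-1-2) with 2 input features, 1 output feature.
A: Sparse = {(0, 1): 1.0, (1, 0): 1.0, (1, 2): 1.0, (2, 1): 1.0}
X: Dense = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
W: Dense = [[1.0], [1.0]]

aggregate_first = gemm(spdmm(A, X, 3), W)   # (A @ X) @ W
transform_first = spdmm(A, gemm(X, W), 3)   # A @ (X @ W)
edge_scores = sddmm(A, X, X)                # one score per existing edge
costs = layer_cost(num_vertices=3, num_edges=4, f_in=2, f_out=1)
```

Both orderings produce the same result because matrix multiplication is associative, but under this simple cost model the cheaper ordering is the one that aggregates over the smaller feature dimension. That is the general intuition behind reordering a computation graph; the abstract does not specify how GraphAGILE's compiler makes this decision.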
