FPGAs are well suited to fixed-point neural network computation thanks to their high power efficiency and configurability. However, a design must be carefully tuned to achieve high performance within limited hardware resources. We present an FPGA-based neural network accelerator and its optimization framework, which achieve near-optimal efficiency across various CNN models and FPGA resource budgets. Targeting high throughput, we adopt a layer-wise pipeline architecture for higher DSP utilization. To reach optimal performance, we also propose a flexible algorithm that allocates balanced hardware resources to each layer, supported by an activation buffer design. Across well-balanced implementations of four CNN models on the ZC706, DSP utilization and efficiency both exceed 90%. For VGG16 on the ZC706, the proposed accelerator achieves 2.58x, 1.53x, and 1.35x better performance than the referenced non-pipelined architecture [1] and the pipelined architectures of [2] and [3], respectively.
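To illustrate the idea of balanced per-layer resource allocation in a layer-wise pipeline, the sketch below assigns DSPs to each layer in proportion to its MAC workload and then rebalances greedily so that no single stage dominates pipeline latency. This is a minimal illustration of the general technique, not the paper's actual algorithm; the function name, workloads, and greedy rebalancing strategy are all assumptions for demonstration.

```python
# Hypothetical sketch: balance DSPs across pipeline stages so that
# per-stage latency (MACs per allocated DSP) is roughly equal.
# Not the paper's algorithm; values and heuristic are illustrative.

def allocate_dsps(layer_macs, total_dsps):
    """Assign DSPs to each layer proportional to its MAC count,
    then greedily rebalance to hit the total DSP budget exactly."""
    total_macs = sum(layer_macs)
    # Initial proportional allocation (at least 1 DSP per layer).
    alloc = [max(1, round(total_dsps * m / total_macs)) for m in layer_macs]

    def stage_cycles(i):
        # Idealized stage latency: work divided by parallelism.
        return layer_macs[i] / alloc[i]

    # If over budget, shed DSPs from the fastest stage (keeping >= 1).
    while sum(alloc) > total_dsps:
        i = min((j for j in range(len(alloc)) if alloc[j] > 1),
                key=stage_cycles)
        alloc[i] -= 1
    # If under budget, give spare DSPs to the bottleneck stage.
    while sum(alloc) < total_dsps:
        i = max(range(len(alloc)), key=stage_cycles)
        alloc[i] += 1
    return alloc

# Example: a 3-layer toy network with MAC counts per layer.
print(allocate_dsps([100, 200, 100], 8))  # proportional split: [2, 4, 2]
```

With equal per-stage cycles, the pipeline's throughput is maximized for the given DSP budget, which is the intuition behind balancing resources layer by layer rather than allocating them uniformly.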