Offloading compute-intensive kernels to hardware accelerators relies on the large degree of parallelism these platforms offer. However, the effective bandwidth of the memory interface often becomes a bottleneck, hindering the accelerator's performance. Techniques that enable data reuse, such as tiling, lower the pressure on memory traffic but still often leave the accelerator I/O-bound. A further increase in effective bandwidth is possible by using burst rather than element-wise accesses, provided the data is contiguous in memory. In this paper, we propose a memory allocation technique, and provide a proof-of-concept source-to-source compiler pass, that enables such burst transfers by modifying the data layout in external memory. We assess how this technique increases memory throughput, leaving room to exploit additional parallelism, at a minimal logic overhead.
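To make the burst-versus-element-wise contrast concrete, the following C sketch (not taken from the paper's compiler pass) compares strided, element-wise external-memory accesses with a tile that has been laid out contiguously and fetched in a single copy. The function names, the TILE size, and the assumption that an HLS-style tool maps a memcpy of a contiguous region onto an AXI burst are all illustrative.

```c
#include <string.h>

#define TILE 256

/* Element-wise access: each strided load a[i * stride] becomes a separate
 * external-memory transaction, so effective bandwidth is dominated by
 * per-transaction latency. */
void compute_elementwise(const float *a, float *out, int stride, int n) {
    for (int i = 0; i < n; i++)
        out[i] = a[i * stride] * 2.0f;   /* non-contiguous: no burst possible */
}

/* Burst access: once the layout is changed so that each tile is contiguous,
 * one memcpy (which HLS tools typically lower to a burst transfer) fetches
 * the whole tile, and the computation runs out of fast local memory.
 * For brevity, n is assumed to be a multiple of TILE. */
void compute_burst(const float *a_tiled, float *out, int n) {
    float buf[TILE];
    for (int t = 0; t < n; t += TILE) {
        memcpy(buf, a_tiled + t, TILE * sizeof(float));  /* burst read */
        for (int i = 0; i < TILE; i++)
            out[t + i] = buf[i] * 2.0f;
    }
}
```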