A High-performance, Energy-efficient Modular DMA Engine Architecture - 专知论文

会员服务 ·

0

Performer · Engineering · 复合数据 · 相互独立的 · 评论员 ·

2023 年 5 月 9 日

A High-performance, Energy-efficient Modular DMA Engine Architecture

翻译：暂无翻译

Thomas Benz,Michael Rogenmoser,Paul Scheffler,Samuel Riedel,Alessandro Ottaviano,Andreas Kurth,Torsten Hoefler,Luca Benini

from arxiv, 14 pages, 14 figures, submitted to an IEEE journal for possible publication

Data transfers are essential in today's computing systems as latency and complex memory access patterns are increasingly challenging to manage. Direct memory access engines (DMAEs) are critically needed to transfer data independently of the processing elements, hiding latency and achieving high throughput even for complex access patterns to high-latency memory. With the prevalence of heterogeneous systems, DMAEs must operate efficiently in increasingly diverse environments. This work proposes a modular and highly configurable open-source DMAE architecture called intelligent DMA (iDMA), split into three parts that can be composed and customized independently. The front-end implements the control plane binding to the surrounding system. The mid-end accelerates complex data transfer patterns such as multi-dimensional transfers, scattering, or gathering. The back-end interfaces with the on-chip communication fabric (data plane). We assess the efficiency of iDMA in various instantiations: In high-performance systems, we achieve speedups of up to 15.8x with only 1 % additional area compared to a base system without a DMAE. We achieve an area reduction of 10 % while improving ML inference performance by 23 % in ultra-low-energy edge AI systems over an existing DMAE solution. We provide area, timing, latency, and performance characterization to guide its instantiation in various systems.

翻译：暂无翻译

0

相关内容

Performer

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Purdue电子与计算机工程系李海桐NanoX实验室招收AI硬件全奖博士生（2023秋季）

Purdue电子与计算机工程系李海桐NanoX实验室招收AI硬件全奖博士生（2023秋季）

机器之心

0+阅读 · 2022年10月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

【推荐】图像分类必读开创性论文汇总

【推荐】图像分类必读开创性论文汇总

机器学习研究会

14+阅读 · 2017年8月15日

多模协同增强NaREF4(RE=Y,Gd)上转换发光纳米材料的发光效率

国家自然科学基金

0+阅读 · 2015年12月31日

中国田鼠亚科 Microtini族(Rodentia: Cricetidae: Arvicolinae)的分类与系统发育研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于纳米发电机的自驱动MEMS/NEMS机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

传感器网络能量有效空中重编程协议研究

国家自然科学基金

1+阅读 · 2014年12月31日

超500MPa级高强钢螺栓连接受剪性能与设计方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

微纳结构铌酸锂波导高速电光调制器

国家自然科学基金

0+阅读 · 2013年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

基于模型-传感器信息融合的典型液压设备故障预测方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于移动网格的局部间断Galerkin有限元方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

桃果实中脱落酸受体ABAR的功能研究

国家自然科学基金

0+阅读 · 2009年12月31日

High-Speed Area-Efficient Hardware Architecture for the Efficient Detection of Faults in a Bit-Parallel Multiplier Utilizing the Polynomial Basis of GF(2m)

Arxiv

0+阅读 · 2023年6月23日

Don't Treat the Symptom, Find the Cause! Efficient Artificial-Intelligence Methods for (Interactive) Debugging

Arxiv

0+阅读 · 2023年6月22日

A Hierarchical Approach to exploiting Multiple Datasets from TalkBank

Arxiv

0+阅读 · 2023年6月21日

Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference

Arxiv

0+阅读 · 2023年6月21日

Supporting Construction and Architectural Visualization through BIM and AR/VR: A Systematic Literature Review

Arxiv

0+阅读 · 2023年6月21日

Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning

Arxiv

0+阅读 · 2023年6月21日

A Survey on Automated Driving System Testing: Landscapes and Trends

Arxiv

12+阅读 · 2022年6月13日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

Arxiv

39+阅读 · 2019年1月17日

Neural Architecture Search: A Survey

Arxiv

12+阅读 · 2018年9月5日

VIP会员

文章信息

相关主题

相互独立的

相关VIP内容

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《军事域人工智能风险、机遇与治理战略指导报告》2025最新76页报告

《杀伤网与精确规模：智能饱和战争时代的战略要务-印度视角》2025最新报告

俄乌冲突的地缘政治与军事教训（万字长文）

《弹药快速效能建模：推进互操作性与技术优势》2025最新26页报告

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Purdue电子与计算机工程系李海桐NanoX实验室招收AI硬件全奖博士生（2023秋季）

Purdue电子与计算机工程系李海桐NanoX实验室招收AI硬件全奖博士生（2023秋季）

机器之心

0+阅读 · 2022年10月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

【推荐】图像分类必读开创性论文汇总

【推荐】图像分类必读开创性论文汇总

机器学习研究会

14+阅读 · 2017年8月15日

相关论文

High-Speed Area-Efficient Hardware Architecture for the Efficient Detection of Faults in a Bit-Parallel Multiplier Utilizing the Polynomial Basis of GF(2m)

Arxiv

0+阅读 · 2023年6月23日

Don't Treat the Symptom, Find the Cause! Efficient Artificial-Intelligence Methods for (Interactive) Debugging

Arxiv

0+阅读 · 2023年6月22日

A Hierarchical Approach to exploiting Multiple Datasets from TalkBank

Arxiv

0+阅读 · 2023年6月21日

Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference

Arxiv

0+阅读 · 2023年6月21日

Supporting Construction and Architectural Visualization through BIM and AR/VR: A Systematic Literature Review

Arxiv

0+阅读 · 2023年6月21日

Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning

Arxiv

0+阅读 · 2023年6月21日

A Survey on Automated Driving System Testing: Landscapes and Trends

Arxiv

12+阅读 · 2022年6月13日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

Arxiv

39+阅读 · 2019年1月17日

Neural Architecture Search: A Survey

Arxiv

12+阅读 · 2018年9月5日

相关基金

多模协同增强NaREF4(RE=Y,Gd)上转换发光纳米材料的发光效率

国家自然科学基金

0+阅读 · 2015年12月31日

中国田鼠亚科 Microtini族(Rodentia: Cricetidae: Arvicolinae)的分类与系统发育研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于纳米发电机的自驱动MEMS/NEMS机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

传感器网络能量有效空中重编程协议研究

国家自然科学基金

1+阅读 · 2014年12月31日

超500MPa级高强钢螺栓连接受剪性能与设计方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

微纳结构铌酸锂波导高速电光调制器

国家自然科学基金

0+阅读 · 2013年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

基于模型-传感器信息融合的典型液压设备故障预测方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于移动网格的局部间断Galerkin有限元方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

桃果实中脱落酸受体ABAR的功能研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员