Deployment of modern TinyML tasks on small, battery-constrained IoT devices requires high computational energy efficiency. Analog In-Memory Computing (IMC) using non-volatile memory (NVM) promises major efficiency improvements in deep neural network (DNN) inference and can also serve as on-chip storage for DNN weights. However, IMC's functional flexibility limitations, and their impact on performance, energy, and area efficiency, are not yet fully understood at the system level. To target practical end-to-end IoT applications, IMC arrays must be embedded in heterogeneous programmable systems, introducing new system-level challenges which we aim to address in this work. We present a heterogeneous, tightly coupled clustered architecture integrating 8 RISC-V cores, an in-memory computing accelerator (IMA), and digital accelerators. We benchmark the system on a highly heterogeneous workload, the Bottleneck layer of MobileNetV2, showing 11.5x performance and 9.5x energy efficiency improvements over highly optimized parallel execution on the cores. Furthermore, we explore the IMC array resources required for end-to-end inference of a full mobile-grade DNN (MobileNetV2) by scaling up our heterogeneous architecture to a multi-array accelerator. Our results show that, on end-to-end MobileNetV2 inference, our solution achieves one order of magnitude lower execution latency than existing programmable architectures and two orders of magnitude lower than state-of-the-art heterogeneous solutions integrating in-memory computing analog cores.
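For illustration, the sketch below shows one plausible way such a Bottleneck layer could be partitioned across a cluster of this kind: the two pointwise (1x1) convolutions are per-pixel matrix-vector products, the natural fit for an analog IMC crossbar, while the depthwise 3x3 convolution stays on the digital cores. This is a minimal, self-contained emulation under stated assumptions, not the presented architecture's actual API: `ima_pointwise` merely stands in for the analog accelerator (whose weights would be pre-programmed into the NVM array), and all names, dimensions, and the toy requantization are illustrative.

```c
/* Hypothetical mapping of a MobileNetV2-style Bottleneck:
 *   1x1 expand  -> IMC array (emulated), 3x3 depthwise -> digital cores,
 *   1x1 project -> IMC array (emulated). All parameters are illustrative. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { H = 14, W = 14, C_IN = 32, C_EXP = 192 };  /* toy layer shape */

static int8_t quant(int32_t v) {        /* toy requantization: >>7, clamp */
    v >>= 7;
    if (v > 127) v = 127;
    if (v < -128) v = -128;
    return (int8_t)v;
}

/* Stand-in for the analog IMC array: one matrix-vector product per pixel,
 * with the weight matrix assumed pre-programmed into the NVM crossbar. */
static void ima_pointwise(const int8_t *x, const int8_t *w, int8_t *y,
                          int c_in, int c_out) {
    for (int p = 0; p < H * W; p++)
        for (int o = 0; o < c_out; o++) {
            int32_t acc = 0;
            for (int i = 0; i < c_in; i++)
                acc += (int32_t)w[(size_t)o * c_in + i] * x[(size_t)p * c_in + i];
            y[(size_t)p * c_out + o] = quant(acc);
        }
}

/* Depthwise 3x3 (stride 1, zero padding), kept on the digital side. */
static void dw_conv3x3(const int8_t *x, const int8_t *w, int8_t *y, int ch) {
    for (int r = 0; r < H; r++)
        for (int c = 0; c < W; c++)
            for (int k = 0; k < ch; k++) {
                int32_t acc = 0;
                for (int dr = -1; dr <= 1; dr++)
                    for (int dc = -1; dc <= 1; dc++) {
                        int rr = r + dr, cc = c + dc;
                        if (rr < 0 || rr >= H || cc < 0 || cc >= W) continue;
                        acc += (int32_t)w[(size_t)k * 9 + (dr + 1) * 3 + (dc + 1)]
                             * x[((size_t)rr * W + cc) * ch + k];
                    }
                y[((size_t)r * W + c) * ch + k] = quant(acc);
            }
}

static int8_t in[H * W * C_IN], ex[H * W * C_EXP], dw[H * W * C_EXP], out[H * W * C_IN];
static int8_t w_exp[C_EXP * C_IN], w_dw[C_EXP * 9], w_prj[C_IN * C_EXP];

int main(void) {
    memset(in, 1, sizeof in);           /* dummy activations and weights */
    memset(w_exp, 1, sizeof w_exp);
    memset(w_dw, 1, sizeof w_dw);
    memset(w_prj, 1, sizeof w_prj);

    ima_pointwise(in, w_exp, ex, C_IN, C_EXP);   /* 1x1 expand  -> "IMA"   */
    dw_conv3x3(ex, w_dw, dw, C_EXP);             /* 3x3 dwise   -> digital */
    ima_pointwise(dw, w_prj, out, C_EXP, C_IN);  /* 1x1 project -> "IMA"   */

    printf("out[0] = %d\n", out[0]);
    return 0;
}
```

In a real system of this kind, the pointwise stages could be issued as jobs to the IMA while the cores run the depthwise stage and requantization in parallel; this interleaving of analog and digital phases is what makes the Bottleneck a representative stress test for heterogeneous scheduling.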