PIM-DRAM:利用商品记录和记录管理处理加快机器学习工作量 (PIM-DRAM: Accelerating Machine Learning Workloads using Processing in Commodity DRAM) - 专知论文

会员服务 ·

0

Processing（编程语言） · 冯 · 诺伊曼架构 · Machine Learning · 学成 · Networking ·

2021 年 5 月 14 日

PIM-DRAM: Accelerating Machine Learning Workloads using Processing in Commodity DRAM

翻译：PIM-DRAM:利用商品记录和记录管理处理加快机器学习工作量

Sourjya Roy,Mustafa Ali,Anand Raghunathan

Deep Neural Networks (DNNs) have transformed the field of machine learning and are widely deployed in many applications involving image, video, speech and natural language processing. The increasing compute demands of DNNs have been widely addressed through Graphics Processing Units (GPUs) and specialized accelerators. However, as model sizes grow, these von Neumann architectures require very high memory bandwidth to keep the processing elements utilized as a majority of the data resides in the main memory. Processing in memory has been proposed as a promising solution for the memory wall bottleneck for ML workloads. In this work, we propose a new DRAM-based processing-in-memory (PIM) multiplication primitive coupled with intra-bank accumulation to accelerate matrix vector operations in ML workloads. The proposed multiplication primitive adds < 1% area overhead and does not require any change in the DRAM peripherals. Therefore, the proposed multiplication can be easily adopted in commodity DRAM chips. Subsequently, we design a DRAM-based PIM architecture, data mapping scheme and dataflow for executing DNNs within DRAM. System evaluations performed on networks like AlexNet, VGG16 and ResNet18 show that the proposed architecture, mapping, and data flow can provide up to 23x speedup over an NVIDIA Titan Xp GPU. Furthermore, it achieves upto 6.5x speedup over an ideal von Neumann architecture with infinite computational throughput, highlighting the need to overcome the memory bottleneck in future generations of DNN hardware.

翻译：深心神经网络(DNNS)改造了机器学习领域,并被广泛用于涉及图像、视频、语音和自然语言处理的许多应用软件中。DNN的计算需求不断增加,通过图形处理股(GPUs)和专门的加速器得到了广泛的解决。然而,随着模型规模的扩大,这些冯纽曼建筑需要非常高的记忆带宽,以保持作为大部分数据在主记忆中使用的处理元素。为ML工作量的内存墙内中瓶颈,提出了一个很有希望的解决方案。在此工作中,我们提出了一个新的基于 DRAM 的代内处理-模拟(PIM) 复制,同时通过银行内部积累加速ML工作量的矩阵矢量操作。拟议的倍增原始程序增加了 < 1% 的区域管理费,因此,提议的倍增功能可以很容易在商品 DRAM 芯片中采用。我们设计了一个基于 DRAM 的内存储版硬性硬性硬性数据结构,数据映像计划和数据流化数据流流流流流,用于DNNNFAS 内执行DIM 的DNIS Streal Steal Steal Steal Stall Stall 。此外, 的系统评估可以向X ASymodelfuld Styal Styal AStoxxx 23 结构提供一种系统评估, AStox AStox AS AS AS to suplup 。

0

相关内容

Processing（编程语言）

Processing（编程语言）

Processing 是一门开源编程语言和与之配套的集成开发环境（IDE）的名称。Processing 在电子艺术和视觉设计社区被用来教授编程基础，并运用于大量的新媒体和互动艺术作品中。

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

面向大数据存储的大型元数据服务器的研究，A Survey on Large Scale Metadata Server for Big Data Storage

面向大数据存储的大型元数据服务器的研究，A Survey on Large Scale Metadata Server for Big Data Storage

专知会员服务

9+阅读 · 2020年5月15日

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

专知会员服务

53+阅读 · 2020年4月7日

【新书】深度学习自然语言处理，Deep Learning for Natural Language Processing

【新书】深度学习自然语言处理，Deep Learning for Natural Language Processing

专知会员服务

67+阅读 · 2019年12月27日

【Amazon AWS】深度学习编译器（Deep Learning Compiler），附35页ppt

【Amazon AWS】深度学习编译器（Deep Learning Compiler），附35页ppt

专知会员服务

43+阅读 · 2019年11月5日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【VLDB2019 tutorial】Combating Fake News: A Data Management and Mining Perspective，不列颠哥伦比亚大|Laks V.S. Lakshmanan，Michael Simpson，Sara Thirumuruganathan，156页PDF

【VLDB2019 tutorial】Combating Fake News: A Data Management and Mining Perspective，不列颠哥伦比亚大|Laks V.S. Lakshmanan，Michael Simpson，Sara Thirumuruganathan，156页PDF

专知会员服务

13+阅读 · 2019年8月27日

【KDD2020-Tutorial】深度学习异常检测，180页ppt

【KDD2020-Tutorial】深度学习异常检测，180页ppt

专知

49+阅读 · 2020年8月28日

分布式并行架构Ray介绍

分布式并行架构Ray介绍

CreateAMind

10+阅读 · 2019年8月9日

计算机类 | PLDI 2020等国际会议信息6条

计算机类 | PLDI 2020等国际会议信息6条

Call4Papers

3+阅读 · 2019年7月8日

人工智能 | UAI 2019等国际会议信息4条

人工智能 | UAI 2019等国际会议信息4条

Call4Papers

6+阅读 · 2019年1月14日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

论文 | CIKM2017 最佳论文鉴赏

论文 | CIKM2017 最佳论文鉴赏

机器学习研究会

4+阅读 · 2017年12月19日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

【推荐】TensorFlow手把手CNN实践指南

【推荐】TensorFlow手把手CNN实践指南

机器学习研究会

5+阅读 · 2017年8月17日

An Evaluation of Machine Learning and Deep Learning Models for Drought Prediction using Weather Data

An Evaluation of Machine Learning and Deep Learning Models for Drought Prediction using Weather Data

Arxiv

0+阅读 · 2021年7月6日

A Deep Learning Object Detection Method for an Efficient Clusters Initialization

Arxiv

0+阅读 · 2021年7月4日

A Survey of Methods for Low-Power Deep Learning and Computer Vision

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Arxiv

14+阅读 · 2020年3月24日

Robust breast cancer detection in mammography and digital breast tomosynthesis using annotation-efficient deep learning approach

Robust breast cancer detection in mammography and digital breast tomosynthesis using annotation-efficient deep learning approach

Arxiv

14+阅读 · 2019年12月27日

Distributed Machine Learning on Mobile Devices: A Survey

Distributed Machine Learning on Mobile Devices: A Survey

Arxiv

37+阅读 · 2019年9月18日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Arxiv

66+阅读 · 2019年9月8日

Accelerated Methods for Deep Reinforcement Learning

Accelerated Methods for Deep Reinforcement Learning

Arxiv

6+阅读 · 2019年1月10日

dynnode2vec: Scalable Dynamic Network Embedding

dynnode2vec: Scalable Dynamic Network Embedding

Arxiv

14+阅读 · 2018年12月6日

Large-Scale Learnable Graph Convolutional Networks

Arxiv

3+阅读 · 2018年8月12日

Scaling Neural Machine Translation

Arxiv

3+阅读 · 2018年6月1日

VIP会员

文章信息

相关主题

Processing（编程语言）

冯 · 诺伊曼架构

Machine Learning

相关VIP内容

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

面向大数据存储的大型元数据服务器的研究，A Survey on Large Scale Metadata Server for Big Data Storage

面向大数据存储的大型元数据服务器的研究，A Survey on Large Scale Metadata Server for Big Data Storage

专知会员服务

9+阅读 · 2020年5月15日

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

专知会员服务

53+阅读 · 2020年4月7日

【新书】深度学习自然语言处理，Deep Learning for Natural Language Processing

【新书】深度学习自然语言处理，Deep Learning for Natural Language Processing

专知会员服务

67+阅读 · 2019年12月27日

【Amazon AWS】深度学习编译器（Deep Learning Compiler），附35页ppt

【Amazon AWS】深度学习编译器（Deep Learning Compiler），附35页ppt

专知会员服务

43+阅读 · 2019年11月5日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【VLDB2019 tutorial】Combating Fake News: A Data Management and Mining Perspective，不列颠哥伦比亚大|Laks V.S. Lakshmanan，Michael Simpson，Sara Thirumuruganathan，156页PDF

【VLDB2019 tutorial】Combating Fake News: A Data Management and Mining Perspective，不列颠哥伦比亚大|Laks V.S. Lakshmanan，Michael Simpson，Sara Thirumuruganathan，156页PDF

专知会员服务

13+阅读 · 2019年8月27日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

【KDD2020-Tutorial】深度学习异常检测，180页ppt

【KDD2020-Tutorial】深度学习异常检测，180页ppt

专知

49+阅读 · 2020年8月28日

分布式并行架构Ray介绍

分布式并行架构Ray介绍

CreateAMind

10+阅读 · 2019年8月9日

计算机类 | PLDI 2020等国际会议信息6条

计算机类 | PLDI 2020等国际会议信息6条

Call4Papers

3+阅读 · 2019年7月8日

人工智能 | UAI 2019等国际会议信息4条

人工智能 | UAI 2019等国际会议信息4条

Call4Papers

6+阅读 · 2019年1月14日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

论文 | CIKM2017 最佳论文鉴赏

论文 | CIKM2017 最佳论文鉴赏

机器学习研究会

4+阅读 · 2017年12月19日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

【推荐】TensorFlow手把手CNN实践指南

【推荐】TensorFlow手把手CNN实践指南

机器学习研究会

5+阅读 · 2017年8月17日

相关论文

An Evaluation of Machine Learning and Deep Learning Models for Drought Prediction using Weather Data

An Evaluation of Machine Learning and Deep Learning Models for Drought Prediction using Weather Data

Arxiv

0+阅读 · 2021年7月6日

A Deep Learning Object Detection Method for an Efficient Clusters Initialization

Arxiv

0+阅读 · 2021年7月4日

A Survey of Methods for Low-Power Deep Learning and Computer Vision

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Arxiv

14+阅读 · 2020年3月24日

Robust breast cancer detection in mammography and digital breast tomosynthesis using annotation-efficient deep learning approach

Robust breast cancer detection in mammography and digital breast tomosynthesis using annotation-efficient deep learning approach

Arxiv

14+阅读 · 2019年12月27日

Distributed Machine Learning on Mobile Devices: A Survey

Distributed Machine Learning on Mobile Devices: A Survey

Arxiv

37+阅读 · 2019年9月18日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Arxiv

66+阅读 · 2019年9月8日

Accelerated Methods for Deep Reinforcement Learning

Accelerated Methods for Deep Reinforcement Learning

Arxiv

6+阅读 · 2019年1月10日

dynnode2vec: Scalable Dynamic Network Embedding

dynnode2vec: Scalable Dynamic Network Embedding

Arxiv

14+阅读 · 2018年12月6日

Large-Scale Learnable Graph Convolutional Networks

Arxiv

3+阅读 · 2018年8月12日

Scaling Neural Machine Translation

Arxiv

3+阅读 · 2018年6月1日

微信扫码咨询专知VIP会员