Deep Neural Networks (DNNs) have transformed the field of machine learning and are widely deployed in many applications involving image, video, speech, and natural language processing. The increasing compute demands of DNNs have been widely addressed through Graphics Processing Units (GPUs) and specialized accelerators. However, as model sizes grow, these von Neumann architectures require very high memory bandwidth to keep the processing elements utilized, since a majority of the data resides in main memory. Processing in memory has been proposed as a promising solution to the memory wall bottleneck for ML workloads. In this work, we propose a new DRAM-based processing-in-memory (PIM) multiplication primitive coupled with intra-bank accumulation to accelerate matrix-vector operations in ML workloads. The proposed multiplication primitive adds < 1% area overhead and does not require any change to the DRAM peripherals; it can therefore be easily adopted in commodity DRAM chips. Building on this primitive, we design a DRAM-based PIM architecture, data mapping scheme, and dataflow for executing DNNs within DRAM. System evaluations performed on networks such as AlexNet, VGG16, and ResNet18 show that the proposed architecture, mapping, and dataflow can provide up to 23x speedup over an NVIDIA Titan Xp GPU. Furthermore, it achieves up to 6.5x speedup over an ideal von Neumann architecture with infinite computational throughput, highlighting the need to overcome the memory bottleneck in future generations of DNN hardware.