DRAM-based main memory is a major component of nearly all computing systems. One way of overcoming the main memory bottleneck is to move computation near memory, a paradigm known as processing-in-memory (PiM). Recent PiM techniques provide a promising way to improve the performance and energy efficiency of existing and future systems at no additional DRAM hardware cost. We develop the Processing-in-DRAM (PiDRAM) framework, the first flexible, end-to-end, and open-source framework that enables system integration studies and evaluation of real PiM techniques using real DRAM chips. We demonstrate a prototype of PiDRAM on an FPGA-based platform (Xilinx ZC706) that implements an open-source RISC-V system (Rocket Chip). To demonstrate the flexibility and ease of use of PiDRAM, we implement two PiM techniques: (1) RowClone, an in-DRAM copy and initialization mechanism (using command sequences proposed by ComputeDRAM), and (2) D-RaNGe, an in-DRAM true random number generator based on DRAM activation-latency failures. Our end-to-end evaluation of RowClone shows up to 14.6X speedup for copy operations and up to 12.6X speedup for initialization operations over CPU copy (i.e., conventional memcpy) and CPU initialization (i.e., conventional calloc), respectively. Our implementation of D-RaNGe provides high-throughput true random numbers, reaching 8.30 Mb/s. Building on the Verilog and C++ base provided by PiDRAM, implementing the required hardware and software components end-to-end takes 198 lines of Verilog and 565 lines of C++ for RowClone, and 190 lines of Verilog and 78 lines of C++ for D-RaNGe. PiDRAM is open sourced on GitHub: https://github.com/CMU-SAFARI/PiDRAM.