Digital processing-in-memory (PIM) architectures are rapidly emerging as a way to overcome the memory-wall bottleneck by integrating logic within the memory elements themselves. Such architectures provide vast computational power inside the memory in the form of parallel bitwise logic operations. We develop novel algorithmic techniques for PIM that, combined with new perspectives on computer arithmetic, extend this bitwise parallelism to the four fundamental arithmetic operations (addition, subtraction, multiplication, and division), for both fixed-point and floating-point numbers, using both bit-serial and bit-parallel approaches. We propose a state-of-the-art suite of arithmetic algorithms; for a majority of the cases covered, it provides the first algorithm in the digital PIM literature, including cases previously considered impossible for digital PIM, such as floating-point addition. Through a case study on memristive PIM, we compare the proposed algorithms to an NVIDIA RTX 3070 GPU and demonstrate significant throughput and energy improvements.
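To make the idea of building arithmetic from parallel bitwise logic concrete, the following is a minimal software sketch of bit-serial, element-parallel fixed-point addition. It is not the algorithm proposed in this work. It assumes, purely for illustration, that NOR is the only available in-memory gate and that each Python integer models one memory column, with bit r of the integer holding the value stored in row r. All function names (`nor`, `full_adder_nor`, `bit_serial_add`, and the helpers) are hypothetical.

```python
def nor(x, y, mask):
    """Column-wise NOR: one gate applied to every row at once."""
    return ~(x | y) & mask


def full_adder_nor(a, b, c, mask):
    """Full adder built from nine NOR gates, applied to whole columns."""
    n1 = nor(a, b, mask)
    n2 = nor(a, n1, mask)
    n3 = nor(b, n1, mask)
    n4 = nor(n2, n3, mask)      # XNOR(a, b)
    n5 = nor(n4, c, mask)
    n6 = nor(n4, n5, mask)
    n7 = nor(c, n5, mask)
    s = nor(n6, n7, mask)       # a XOR b XOR c
    cout = nor(n1, n5, mask)    # majority(a, b, c)
    return s, cout


def to_columns(values, width):
    """Transpose row values into bit columns (column i = bit i of every row)."""
    cols = []
    for i in range(width):
        col = 0
        for r, v in enumerate(values):
            col |= ((v >> i) & 1) << r
        cols.append(col)
    return cols


def from_columns(cols, rows):
    """Inverse of to_columns: rebuild one integer per row."""
    return [
        sum(((col >> r) & 1) << i for i, col in enumerate(cols))
        for r in range(rows)
    ]


def bit_serial_add(xs, ys, width):
    """Add xs[r] + ys[r] for all rows r, one bit position per step.

    The result wraps modulo 2**width, as in fixed-width unsigned addition.
    """
    rows = len(xs)
    mask = (1 << rows) - 1          # one bit per simulated row
    a_cols = to_columns(xs, width)
    b_cols = to_columns(ys, width)
    carry = 0                       # carry column, initially zero in every row
    out = []
    for i in range(width):          # serial over bits, parallel over rows
        s, carry = full_adder_nor(a_cols[i], b_cols[i], carry, mask)
        out.append(s)
    return from_columns(out, rows)


if __name__ == "__main__":
    print(bit_serial_add([3, 200, 255, 0], [5, 100, 1, 0], 8))
```

In this sketch the number of gate steps depends only on the bit width (nine NOR operations per bit position), not on the number of rows, which is the source of the throughput that bit-serial PIM arithmetic relies on.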