Processing-in-memory (PIM) seeks to eliminate data transfer between computation and memory by using devices that support both storage and logic. Stateful logic techniques such as IMPLY, MAGIC and FELIX can perform logic gates within memristive crossbar arrays with massive parallelism. Multiplication via stateful logic is an active field of research because of its wide implications. Recently, RIME became the state-of-the-art algorithm for stateful single-row multiplication by using memristive partitions, reducing latency by 5.1x relative to the previous state of the art. In this paper, we begin by proposing novel partition-based computation techniques for broadcasting and shifting data. Then, we design an in-memory multiplication algorithm based on the carry-save add-shift (CSAS) technique. Finally, we develop a novel stateful full-adder that significantly improves on the state-of-the-art (FELIX) design. These contributions constitute MultPIM, a multiplier that reduces state-of-the-art time complexity from quadratic to linear-log. For 32-bit numbers, MultPIM improves latency by an additional 4.2x over RIME while slightly reducing area overhead. Furthermore, we optimize MultPIM for full-precision matrix-vector multiplication and improve latency by 25.5x over FloatPIM matrix-vector multiplication.
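To make the carry-save add-shift (CSAS) idea concrete, the following is a minimal software sketch of the arithmetic only. It is not the in-memory MultPIM implementation and it models no memristive partitions, broadcasting, or stateful gates. The function name `csas_multiply` and the parameter `n` (operand width in bits) are hypothetical, chosen for illustration. Each loop iteration stands for one stage in which a row of full adders operates in parallel, the lowest sum bit is emitted as a product bit, and the remaining sum bits shift right by one position.

```python
def csas_multiply(a: int, b: int, n: int) -> int:
    """Multiply two unsigned n-bit integers with carry-save add-shift.

    Illustrative model only: Python integers stand in for rows of bits,
    and each bitwise operation models n full adders working in parallel.
    """
    mask = (1 << n) - 1
    if not (0 <= a <= mask and 0 <= b <= mask):
        raise ValueError("operands must fit in n bits")

    s = 0        # sum bits of the carry-save accumulator
    c = 0        # carry bits of the carry-save accumulator
    product = 0  # product bits, emitted least-significant first

    # The first n stages consume the bits of b; the last n stages feed in
    # zero partial products to flush the remaining sum and carry bits.
    for i in range(2 * n):
        b_i = (b >> i) & 1           # 0 once i >= n, since b < 2**n
        p = a if b_i else 0          # partial product: a AND b_i, bitwise

        # Full adders in parallel: s + c + p == total + 2 * carry
        total = s ^ c ^ p
        carry = (s & c) | (s & p) | (c & p)

        product |= (total & 1) << i  # lowest sum bit is final product bit i
        s = total >> 1               # shift the sum right by one position
        c = carry                    # carries stay in place (weight doubles)

    # After 2n stages the accumulator is empty because a * b < 2**(2n).
    assert s == 0 and c == 0
    return product
```

For example, `csas_multiply(13, 11, 4)` returns 143. No carry propagates across bit positions within a stage, which is why the technique suits parallel per-bit hardware; how MultPIM maps these stages onto partitions is described in the paper itself rather than in this sketch.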