Custom Memory Design for Logic-in-Memory: Drawbacks and Improvements over Conventional Memories

The speed of modern digital systems is severely limited by memory latency (the ``Memory Wall'' problem). Data exchange between logic and memory also accounts for a large part of the system energy consumption. Logic--in--Memory (LiM) is an attractive solution to this problem: by performing part of the computation directly inside the memory, the system speed can be improved while its energy consumption is reduced. The LiM solutions that offer the largest performance boost are based on modifying the memory cell. However, what is the cost of such modifications, and how do they affect the performance of the memory array? In this work, these questions are addressed by analysing a LiM memory array that implements an algorithm for maximum/minimum value computation. The memory array is designed at the physical level in the FreePDK $\SI{45}{\nano\meter}$ CMOS process with three memory cell variants, and its performance is compared with that of SRAM and CAM memories. The results show that read and write performance is worsened, but in--memory operations prove very efficient: the AND operation achieves a 55.26\% reduction in the energy--delay product with respect to the SRAM read operation. The LiM approach therefore represents a very promising solution for low--density, high--performance memories.

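To make the idea of a maximum search built from in--memory AND operations more concrete, the following is a minimal software sketch of one well-known bit-serial scheme. It is an illustration only and is not claimed to be the algorithm implemented in the array described above: the function name `in_memory_max`, the `width` parameter, and the candidate-flag bookkeeping are all hypothetical. Each loop iteration models one array-wide AND of every stored word with a one-hot mask, scanning from the most significant bit downwards.

```python
def in_memory_max(words, width):
    """Bit-serial maximum search (illustrative sketch, not the paper's design).

    words: list of non-negative integers standing in for the memory rows.
    width: number of bits per stored word (assumed parameter).
    Returns (maximum value, index of the first row holding it).
    """
    if not words:
        raise ValueError("memory array is empty")

    # One flag per row: True while the row may still hold the maximum.
    candidates = [True] * len(words)

    for bit in range(width - 1, -1, -1):
        mask = 1 << bit
        # Models one in-memory AND of every row with a one-hot mask,
        # restricted to the rows that are still candidates.
        hits = [flag and (word & mask) != 0
                for word, flag in zip(words, candidates)]
        # If any candidate has this bit set, rows without it are eliminated.
        # Otherwise the bit cannot discriminate and the candidate set is kept.
        if any(hits):
            candidates = hits

    index = candidates.index(True)
    return words[index], index


if __name__ == "__main__":
    value, row = in_memory_max([5, 12, 7, 12, 3], width=4)
    print(value, row)
```

Under the same assumptions, a minimum search follows by running the same loop on the bitwise complement of each word. In this model the number of array-wide operations grows with the word width rather than with the number of rows.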