Data-intensive tasks on the von Neumann architecture struggle to achieve both high performance and power efficiency because of the memory wall bottleneck. Computing-in-memory (CiM) is a promising mitigation approach that enables parallel in-situ multiply-accumulate (MAC) operations within the memory, with support from the peripheral interface and datapath. SRAM-based charge-domain CiM (CD-CiM) has shown potential for enhanced power efficiency and computing accuracy. However, existing SRAM-based CD-CiM designs face scaling challenges in meeting the throughput requirements of high-performance multi-bit-quantization applications. This paper presents an SRAM-based high-throughput ReLU-optimized CD-CiM macro. It completes the MAC and ReLU of two signed 8b vectors in one CiM cycle with only one A/D conversion. Together with non-linearity compensation for the analog computing and A/D conversion interfaces, this work achieves 51.2GOPS throughput and 10.3TOPS/W energy efficiency, while showing 88.6% accuracy on the CIFAR-10 dataset.
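As a point of reference for the operation described above, the following is a minimal digital sketch of what "MAC and ReLU of two signed 8b vectors" computes functionally. It is not the macro's charge-domain implementation and models neither the analog computation nor the A/D conversion; the function name `mac_relu` and the vector lengths used are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence

# Signed 8-bit operand range (two's complement).
INT8_MIN, INT8_MAX = -128, 127


def mac_relu(inputs: Sequence[int], weights: Sequence[int]) -> int:
    """Hypothetical reference model: dot product of two signed 8b vectors, then ReLU.

    Purely digital and exact; the real macro performs this in the charge
    domain and digitizes the result with a single A/D conversion.
    """
    if len(inputs) != len(weights):
        raise ValueError("vectors must have equal length")
    for value in (*inputs, *weights):
        if not INT8_MIN <= value <= INT8_MAX:
            raise ValueError(f"{value} is outside the signed 8-bit range")

    # Multiply-accumulate over all element pairs.
    acc = sum(x * w for x, w in zip(inputs, weights))

    # ReLU: negative accumulations are clamped to zero.
    return max(acc, 0)


if __name__ == "__main__":
    print(mac_relu([1, 2, 3], [4, 5, 6]))     # positive sum passes through
    print(mac_relu([127, -128], [-1, 1]))     # negative sum is clamped by ReLU
```

Because ReLU discards every negative result, only the non-negative part of the accumulation range has to be resolved at the output, which is one plausible reading of why folding ReLU into the macro pairs naturally with a single conversion per cycle.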