Deep-learning neural networks are pervasive, but traditional computer architectures are reaching the limits of their ability to execute them efficiently for today's large workloads. They are constrained by the von Neumann bottleneck: the high cost in energy and latency incurred in moving data between memory and the compute engine. Today, specialized CMOS designs address this bottleneck. The next generation of computing hardware will need to eliminate or dramatically mitigate it. We discuss how compute-in-memory can play an important part in this development. Here, a non-volatile-memory-based cross-bar architecture forms the heart of an engine that uses an analog process to parallelize the matrix-vector multiplication operation used repeatedly in all neural-network workloads. The cross-bar architecture, at times referred to as a neuromorphic approach, can be a key hardware element in future computing machines. In the first part of this review we take a co-design view of the design constraints and the demands they place on the new materials and memory devices that anchor the cross-bar architecture. In the second part, we review what is known about the different new non-volatile memory materials and devices suited for compute-in-memory, and discuss the outlook and challenges.
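To make the analog operation concrete, here is a minimal sketch of the cross-bar matrix-vector multiplication as a NumPy simulation. All dimensions, conductance ranges, and voltage levels are illustrative assumptions, not values from the review: weights are stored as device conductances, inputs are applied as row voltages, and Kirchhoff's current law sums the per-cell currents along each column.

```python
import numpy as np

# Hedged sketch (assumed values): simulate the analog matrix-vector
# multiplication performed by a non-volatile-memory cross-bar array.
# Each weight is stored as a cell conductance G[i, j]; each input
# activation is applied as a row voltage v[i]. By Ohm's law a cell
# contributes current G[i, j] * v[i], and Kirchhoff's current law sums
# these contributions along every column wire in a single analog step.

rng = np.random.default_rng(0)

n_rows, n_cols = 4, 3                            # cross-bar size (assumed)
G = rng.uniform(1e-6, 1e-4, (n_rows, n_cols))    # conductances, siemens
v = rng.uniform(0.0, 0.2, n_rows)                # read voltages, volts

# Column currents: i_out[j] = sum_i G[i, j] * v[i]
i_out = G.T @ v

print(i_out)  # one current per column, i.e. one dot product per output
```

In a physical array each column current would then be digitized by an analog-to-digital converter; the point of the sketch is that the whole matrix-vector product is computed in one parallel analog step rather than by streaming weights from memory.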