The rise of data-intensive applications exposed the limitations of conventional processor-centric von-Neumann architectures, which struggle to meet the off-chip memory bandwidth demand. Therefore, recent innovations in computer architecture advocate compute-in-memory (CIM) and compute-near-memory (CNM), non-von-Neumann paradigms achieving orders-of-magnitude improvements in performance and energy consumption. Despite significant technological breakthroughs in the last few years, the programmability of these systems is still a serious challenge. Their programming models are too low-level and specific to particular system implementations. Since such future architectures are predicted to be highly heterogeneous, developing novel compiler abstractions and frameworks becomes necessary. To this end, we present CINM (Cinnamon), a first end-to-end compilation flow that leverages hierarchical abstractions to generalize over different CIM and CNM devices and enable device-agnostic and device-aware optimizations. Cinnamon progressively lowers input programs and performs optimizations at each level in the lowering pipeline. To show its efficacy, we evaluate CINM on a set of benchmarks for the well-known UPMEM CNM system and memristor-based CIM accelerators. We show that Cinnamon, supporting multiple hardware targets, generates high-performance code comparable to or better than state-of-the-art implementations.