In-memory-computing is emerging as an efficient hardware paradigm for deep neural network accelerators at the edge, enabling to break the memory wall and exploit massive computational parallelism. Two design models have surged: analog in-memory-computing (AIMC) and digital in-memory-computing (DIMC), offering a different design space in terms of accuracy, efficiency and dataflow flexibility. This paper targets the fair comparison and benchmarking of both approaches to guide future designs, through a.) an overview of published architectures; b.) an analytical cost model for energy and throughput; c.) scheduling of workloads on a variety of modeled IMC architectures for end-to-end network efficiency analysis, offering valuable workload-hardware co-design insights.
翻译:暂无翻译