AnalogNets: ML-HW Co-Design of Noise-robust TinyML Models and Always-On Analog Compute-in-Memory Accelerator

Chuteng Zhou, Fernando Garcia Redondo, Julian Büchel, Irem Boybat, Xavier Timoneda Comas, S. R. Nandakumar, Shidhartha Das, Abu Sebastian, Manuel Le Gallo, Paul N. Whatmough

Always-on TinyML perception tasks in IoT applications require very high energy efficiency. Analog compute-in-memory (CiM) using non-volatile memory (NVM) promises high efficiency and also provides self-contained on-chip model storage. However, analog CiM introduces new practical considerations, including conductance drift, read/write noise, and fixed analog-to-digital converter (ADC) gain. These additional constraints must be addressed to achieve models that can be deployed on analog CiM with acceptable accuracy loss. This work describes AnalogNets: TinyML models for the popular always-on applications of keyword spotting (KWS) and visual wake words (VWW). The model architectures are specifically designed for analog CiM, and we detail a comprehensive training methodology to retain accuracy in the face of analog non-idealities and low-precision data converters at inference time. We also describe AON-CiM, a programmable, minimal-area phase-change memory (PCM) analog CiM accelerator with a novel layer-serial approach that removes the cost of the complex interconnects associated with a fully pipelined design. We evaluate the AnalogNets on a calibrated simulator, as well as on real hardware, and find that accuracy degradation is limited to 0.8%/1.2% after 24 hours of PCM drift (8-bit) for KWS/VWW. AnalogNets running on the 14nm AON-CiM accelerator demonstrate 8.58/4.37 TOPS/W for KWS/VWW workloads, respectively, using 8-bit activations, increasing to 57.39/25.69 TOPS/W with 4-bit activations.
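
To make the non-idealities named in the abstract concrete, the following is a minimal illustrative sketch of an analog matrix-vector multiply with conductance drift, read noise, and a fixed-gain low-precision ADC. It is not the authors' simulator or training method. The function names, the drift exponent `nu`, the reference time `t0`, the noise level, and the ADC full-scale value are all hypothetical, and applying the commonly used power-law drift model directly to signed weights is a simplification chosen only for illustration.

```python
import random


def drift(g0, t, t0=1.0, nu=0.05):
    """Power-law drift model: G(t) = G(t0) * (t / t0) ** (-nu).

    nu and t0 are illustrative assumptions, not values from the paper.
    """
    return g0 * (t / t0) ** (-nu)


def adc_quantize(x, bits, full_scale):
    """Uniform signed quantizer with a fixed gain (fixed full-scale range)."""
    levels = 2 ** (bits - 1) - 1           # e.g. 127 for 8-bit, 7 for 4-bit
    step = full_scale / levels
    x = max(-full_scale, min(full_scale, x))  # values outside the range clip
    return round(x / step) * step


def analog_mvm(weights, x, t=1.0, read_noise_std=0.0,
               adc_bits=8, full_scale=4.0, rng=None):
    """Noisy matrix-vector product: one ADC conversion per output row."""
    rng = rng or random.Random(0)
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            # Drifted weight plus additive Gaussian read noise (assumed model).
            w_eff = drift(w, t) + rng.gauss(0.0, read_noise_std)
            acc += w_eff * xi
        out.append(adc_quantize(acc, adc_bits, full_scale))
    return out


if __name__ == "__main__":
    W = [[0.5, -0.25], [1.0, 1.0]]
    v = [1.0, 2.0]
    print("fresh, 8-bit ADC:   ", analog_mvm(W, v))
    print("24 h later, 4-bit ADC:",
          analog_mvm(W, v, t=86400.0, read_noise_std=0.02, adc_bits=4))
```

Because the ADC gain is fixed, outputs that shrink with drift occupy fewer quantization levels, and outputs that exceed the full-scale range clip. This kind of interaction is one reason a model may need to be trained with such effects in mind, as the abstract describes.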
