Specialized accelerators have recently garnered attention as a way to reduce the power consumption of neural network inference. A promising category of accelerators uses nonvolatile memory arrays both to store weights and to perform $\textit{in situ}$ analog computation inside the array. While prior work has explored the design space of analog accelerators to optimize performance and energy efficiency, the accuracy of these accelerators has seldom been rigorously evaluated. This work shows how architectural design decisions, particularly in mapping neural network parameters to analog memory cells, influence inference accuracy. When evaluated using ResNet50 on ImageNet, the system's resilience to each of three analog non-idealities (cell programming errors, analog-to-digital converter resolution, and array parasitic resistances) improves when analog quantities in the hardware are made proportional to the weights in the network. Moreover, contrary to the assumptions of prior work, nearly equivalent resilience to cell imprecision can be achieved by storing each weight entirely as a single analog quantity, rather than spreading its bits across multiple devices, an approach often referred to as bit slicing. By exploiting proportionality, analog system designers are free to match the precision of the hardware to the needs of the algorithm, rather than attempting to guarantee the same level of precision in the intermediate results as an equivalent digital accelerator. This ultimately results in an analog accelerator that is more accurate, more robust to analog errors, and more energy-efficient.
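The two weight mappings contrasted above can be illustrated with a minimal NumPy sketch. It is not the simulator used in this work: the function names, the unsigned 8-bit weights, the 2-bit slices, and the error model (Gaussian programming error with standard deviation `sigma`, expressed as a fraction of each cell's full conductance range and independent of the programmed state) are all assumptions made for illustration.

```python
import numpy as np

def bit_slice(w_int, bits_per_cell, num_slices):
    """Split non-negative integer weights into slices, least-significant first."""
    mask = (1 << bits_per_cell) - 1
    return [(w_int >> (k * bits_per_cell)) & mask for k in range(num_slices)]

def mvm_proportional(w_int, x, sigma, rng, levels=256):
    """One cell per weight: normalized conductance is proportional to the weight."""
    g = w_int / (levels - 1)                      # conductance in [0, 1]
    g_noisy = g + sigma * rng.standard_normal(g.shape)
    return (g_noisy @ x) * (levels - 1)           # rescale to weight units

def mvm_bit_sliced(w_int, x, sigma, rng, bits_per_cell=2, num_slices=4):
    """Several cells per weight: each cell stores bits_per_cell bits of the weight."""
    cell_max = (1 << bits_per_cell) - 1
    total = np.zeros(w_int.shape[0])
    for k, s in enumerate(bit_slice(w_int, bits_per_cell, num_slices)):
        g = s / cell_max                          # conductance in [0, 1]
        g_noisy = g + sigma * rng.standard_normal(g.shape)
        # Digital shift-and-add: weight each slice's result by its significance.
        total += (g_noisy @ x) * cell_max * (1 << (k * bits_per_cell))
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.integers(0, 256, size=(64, 128))      # hypothetical 8-bit weights
    x = rng.random(128)                           # hypothetical activations
    exact = w @ x
    for sigma in (0.0, 0.01, 0.05):
        err_p = np.abs(mvm_proportional(w, x, sigma, rng) - exact).mean()
        err_b = np.abs(mvm_bit_sliced(w, x, sigma, rng) - exact).mean()
        print(f"sigma={sigma}: proportional={err_p:.3f}, bit-sliced={err_b:.3f}")
```

With `sigma=0` both mappings reproduce the exact matrix-vector product, so any difference at nonzero `sigma` comes only from how cell errors propagate through each mapping. The sketch omits signed weights, analog-to-digital converter quantization, and parasitic resistance, all of which would need their own models.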