TxSim: Modeling Training of Deep Neural Networks on Resistive Crossbar Systems

Resistive crossbars have attracted significant interest in the design of Deep Neural Network (DNN) accelerators because they can natively execute massively parallel vector-matrix multiplications within dense memory arrays. However, crossbar-based computations are subject to a variety of device- and circuit-level non-idealities, which manifest as errors in the vector-matrix multiplications and eventually degrade DNN accuracy. Addressing this challenge requires tools that can model the functional impact of non-idealities on DNN training and inference. Existing efforts towards this goal are either limited to inference or too slow to be used for large-scale DNN training. We propose TxSim, a fast and customizable modeling framework that functionally evaluates DNN training on crossbar-based hardware while accounting for the impact of non-idealities. Two key features differentiate TxSim from prior efforts: (i) it comprehensively models non-idealities during all training operations (forward propagation, backward propagation, and weight update), and (ii) it achieves computational efficiency by mapping crossbar evaluations to well-optimized BLAS routines and by incorporating speedup techniques that further reduce simulation time with minimal impact on accuracy. TxSim achieves orders-of-magnitude improvement in simulation speed over prior work, and thereby makes it feasible to evaluate training of large-scale DNNs on crossbars. Our experiments using TxSim reveal that the accuracy degradation in DNN training due to non-idealities can be substantial (3%-10%) for large-scale DNNs, underscoring the need for further research in mitigation techniques. We also analyze the impact of various device- and circuit-level parameters and the associated non-idealities to provide key insights that can guide the design of crossbar-based DNN training accelerators.
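To make the idea of functional crossbar modeling concrete, the following is a minimal NumPy sketch of a non-ideal vector-matrix multiplication evaluated tile by tile, with each tile mapped to a dense matrix product (which NumPy typically dispatches to a BLAS GEMM routine). It is not TxSim's implementation or API. The function names `quantize` and `crossbar_vmm`, the parameters `tile_rows`, `weight_bits`, `adc_bits`, and `sigma`, and the choice of non-idealities (limited weight precision, multiplicative Gaussian conductance variation, and ADC quantization of partial sums) are all assumptions made for illustration.

```python
import numpy as np


def quantize(x, bits, x_max):
    """Uniformly quantize x to signed `bits`-bit levels in [-x_max, x_max]."""
    levels = 2 ** (bits - 1) - 1
    scale = max(x_max, 1e-12) / levels  # guard against an all-zero range
    return np.clip(np.round(x / scale), -levels, levels) * scale


def crossbar_vmm(x, W, tile_rows=64, weight_bits=4, adc_bits=8,
                 sigma=0.02, rng=None):
    """Approximate y = x @ W as if W were stored on tiled crossbars.

    Hypothetical non-ideality model (an assumption, not taken from the paper):
      1. weights are quantized to `weight_bits` (limited conductance states),
      2. each stored weight is perturbed by multiplicative Gaussian noise,
      3. each tile's partial sum is quantized by an `adc_bits` ADC.
    """
    rng = np.random.default_rng() if rng is None else rng
    w_max = np.abs(W).max()

    W_stored = quantize(W, weight_bits, w_max)
    W_stored = W_stored * (1.0 + sigma * rng.standard_normal(W.shape))

    y = np.zeros((x.shape[0], W.shape[1]))
    for start in range(0, W.shape[0], tile_rows):
        rows = slice(start, start + tile_rows)
        # One dense matrix product per crossbar tile.
        partial = x[:, rows] @ W_stored[rows, :]
        # ADC full-scale range: largest partial sum this tile could produce
        # with ideal weights; values beyond it are clipped.
        full_scale = np.abs(x[:, rows]).sum(axis=1).max() * w_max
        y += quantize(partial, adc_bits, full_scale)
    return y


# Usage: compare the non-ideal result against the ideal product.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 128))
W = rng.standard_normal((128, 32))
ideal = x @ W
noisy = crossbar_vmm(x, W, rng=np.random.default_rng(1))
print("mean absolute error:", np.abs(noisy - ideal).mean())
```

In a training setting, the same kind of routine would stand in for the matrix products of forward propagation, backward propagation (using the transposed weights), and the weight update, which is why expressing each tile as a dense matrix product matters for simulation speed.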
