Non-Volatile Memories (NVMs) such as Resistive RAM (RRAM) are used in neuromorphic systems to implement high-density and low-power analog synaptic weights. Unfortunately, an RRAM cell can switch its state after its content has been read a certain number of times. Such behavior challenges the integrity and the program-once-read-many-times philosophy of implementing machine learning inference on neuromorphic systems, impacting the Quality-of-Service (QoS). Elevated temperatures and frequent usage can significantly reduce the number of times an RRAM cell can be reliably read before reprogramming becomes necessary. We propose an architectural solution to extend the read endurance of RRAM-based neuromorphic systems. We make two key contributions. First, we formulate the read endurance of an RRAM cell as a function of the programmed synaptic weight and its activation within a machine learning workload. Second, we propose an intelligent workload mapping strategy that incorporates this endurance formulation to place the synapses of a machine learning model onto the RRAM cells of the hardware. The objective is to extend the inference lifetime, defined as the number of times the model can be used to generate output (inference) before the trained weights need to be reprogrammed on the RRAM cells of the system. We evaluate our architectural solution with machine learning workloads on a cycle-accurate simulator of an RRAM-based neuromorphic system. Our results demonstrate a significant increase in inference lifetime with only a minimal performance impact.
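The mapping idea described above can be sketched in a few lines of Python. Note that the endurance model, the per-cell durability factors, and the greedy heuristic below are illustrative assumptions for exposition only, not the paper's actual formulation: we assume each cell's reliable-read budget shrinks with the magnitude of the programmed weight, scale it by a hypothetical per-cell factor (e.g., capturing temperature), and greedily assign the most stressed synapses to the most durable cells.

```python
def read_endurance(weight, base_endurance=1e6, sensitivity=0.5):
    """Toy endurance model (assumed, not from the paper):
    a larger programmed |weight| permits fewer reliable reads."""
    return base_endurance / (1.0 + sensitivity * abs(weight))

def inference_lifetime(mapping, weights, activations, cell_factor):
    """Inference lifetime = inferences until the FIRST cell must be
    reprogrammed: (cell's read budget) / (reads per inference)."""
    return min(
        cell_factor[c] * read_endurance(weights[s]) / max(activations[s], 1e-12)
        for s, c in mapping.items()
    )

def endurance_aware_mapping(weights, activations, cell_factor):
    """Greedy heuristic: rank synapses by per-inference endurance stress
    (activation rate over read budget), rank cells by durability, and
    pair the most stressed synapse with the most durable cell."""
    stress_order = sorted(
        range(len(weights)),
        key=lambda s: activations[s] / read_endurance(weights[s]),
        reverse=True,
    )
    durable_order = sorted(
        range(len(cell_factor)), key=lambda c: cell_factor[c], reverse=True
    )
    return {s: c for s, c in zip(stress_order, durable_order)}
```

Pairing sorted stress against sorted durability maximizes the minimum budget-to-stress ratio across cells, so the lifetime under this mapping is never worse than under a naive positional placement.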