In-memory computing has recently become a popular architecture for deep-learning hardware accelerators because of its high parallelism, low power consumption, and low area cost. However, in-RRAM computing (IRC) suffers from large device variation and numerous nonideal effects in hardware. Although previous approaches that include these effects in model training have improved variation tolerance, they considered only a subset of the nonideal effects and relatively simple classification tasks. This paper proposes a joint hardware and software optimization strategy for designing a hardware-robust IRC macro for object detection. We lower the cell current by using a low word-line voltage, which enables a complete convolution to be computed in one operation and minimizes the impact of nonlinear addition. We also implement ternary weight mapping and remove batch normalization for better tolerance against device variation, sense-amplifier variation, and IR drop. An extra bias is included to overcome the limited current-sensing range. The proposed approach has been successfully applied to a complex object detection task with only a 3.85\% mAP drop, whereas a naive design suffers catastrophic failure under these nonideal effects.
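As a rough illustration of the ternary weight mapping mentioned in the abstract, the sketch below quantizes real-valued weights to {-1, 0, +1} and maps each value onto a pair of RRAM cells. The abstract does not give the actual quantization rule or cell layout, so everything here is an assumption: the threshold heuristic (a fraction of the mean absolute weight, as commonly used in ternary weight networks), the `delta_scale` parameter, and the differential two-cell encoding in the hypothetical helper `to_cell_pair` are illustrative only.

```python
def ternarize(weights, delta_scale=0.7):
    """Map real-valued weights to {-1, 0, +1}.

    Assumption: threshold = delta_scale * mean(|w|), a common heuristic
    from ternary weight networks. The paper's own rule is not stated in
    the abstract.
    """
    if not weights:
        return []
    threshold = delta_scale * sum(abs(w) for w in weights) / len(weights)
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1)
            for w in weights]


def to_cell_pair(t):
    """Hypothetical differential mapping of one ternary weight to two cells.

    Returns (positive_cell, negative_cell), where 1 stands for the
    low-resistance state and 0 for the high-resistance state.
    """
    return {1: (1, 0), 0: (0, 0), -1: (0, 1)}[t]


weights = [0.9, -0.05, -0.6, 0.1]
ternary = ternarize(weights)                 # small weights collapse to 0
cells = [to_cell_pair(t) for t in ternary]   # one cell pair per weight
print(ternary, cells)
```

With only three weight levels, each cell needs to hold just one of two resistance states, which is one plausible reason a ternary mapping could tolerate device variation better than a multi-level mapping.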