Convolutional neural networks (CNNs) have become a ubiquitous algorithm with growing applications in mobile and edge settings. We describe a compute-in-memory (CIM) technique called FPIRM that uses Racetrack Memory (RM) to accelerate CNNs for edge systems. Using transverse read, a technique that can determine the number of '1's in multiple adjacent domains, FPIRM can efficiently implement multi-operand bulk-bitwise and addition computations, as well as two-operand multiplication. We discuss how FPIRM can implement both variable-precision integer and floating-point arithmetic. This enables both CNN inference and on-device training without expensive data movement to the cloud. Based on these functions we demonstrate the implementation of several CNNs with back propagation using RM CIM and compare them to state-of-the-art implementations of CIM inference and training on Field-Programmable Gate Arrays (FPGAs). During training, FPIRM improves efficiency by 2$\times$ over the FPGA, reducing energy consumption by at least 27% and increasing throughput by at least 18%.
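As a minimal sketch of the underlying idea, and not the FPIRM implementation itself, the snippet below models how a transverse read that returns only the count of '1's across one bit position of several stacked operands can drive bit-serial multi-operand addition: the count's least significant bit is the sum bit and the remaining bits become carries for higher positions. The helper names `transverse_read` and `multi_operand_add` are hypothetical.

```python
# Minimal sketch (not the FPIRM implementation) of multi-operand addition
# built on a transverse read (TR) primitive.

def transverse_read(column_bits):
    """Model a TR over one bit position of stacked operands: only the
    population count of the domains is observable, not their order."""
    return sum(column_bits)

def multi_operand_add(operands, width=8):
    """Bit-serial multi-operand addition: at each bit position the TR
    count's LSB is the sum bit; the remaining bits are carries that
    propagate into higher positions."""
    result, carry = 0, 0
    # The sum of n width-bit operands needs width + ceil(log2(n)) bits.
    for i in range(width + len(operands).bit_length()):
        column = [(op >> i) & 1 for op in operands]
        total = transverse_read(column) + carry
        result |= (total & 1) << i
        carry = total >> 1
    return result

assert multi_operand_add([3, 5, 7, 9]) == 24
```

Because each column is reduced in a single transverse read rather than one two-operand add per operand, n operands cost one TR per bit position instead of n-1 serial additions, which is the source of the efficiency claimed above.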