In this paper, we present GradPIM, a processing-in-memory architecture that accelerates the parameter-update phase of deep neural network training. As a processing-in-memory technique that could be realized in the near future, we propose an incremental, simple architectural design that does not disturb the existing memory protocol. Extending DDR4 SDRAM to exploit bank-group parallelism makes the operation designs in our processing-in-memory (PIM) module efficient in terms of hardware cost and performance. Our experimental results show that the proposed architecture can improve DNN training performance and greatly reduce the memory bandwidth requirement while imposing only minimal overhead on the protocol and DRAM area.