Deep Learning Recommendation Models (DLRMs) are widespread, account for a considerable data center footprint, and grow by more than 1.5x per year. With model sizes soon to reach the terabyte range, leveraging Storage-Class Memory (SCM) for inference enables lower power consumption and cost. This paper evaluates the major challenges in extending the memory hierarchy to SCM for DLRM, and presents different techniques to improve performance through a Software-Defined Memory. We show how underlying technologies such as NAND Flash and 3DXP differ, how they map to real-world scenarios, and how they enable power savings of 5% to 29%.
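To make the idea of extending the memory hierarchy to SCM concrete, the sketch below models a two-tier embedding store: a small DRAM-resident LRU cache in front of a large SCM-backed table. This is a hypothetical illustration under assumed names and sizes (`TieredEmbeddingStore`, `dram_capacity`), not the paper's actual design; it only shows why the skewed access patterns typical of DLRM embedding lookups let a small fast tier absorb most traffic.

```python
from collections import OrderedDict

# Hypothetical sketch: a DRAM-resident LRU cache (fast tier) in front of a
# large table assumed to live on SCM (slow tier). Illustrative only.
class TieredEmbeddingStore:
    def __init__(self, scm_table, dram_capacity):
        self.scm_table = scm_table          # full table, "on SCM" (slow tier)
        self.dram_capacity = dram_capacity  # number of rows the fast tier holds
        self.dram_cache = OrderedDict()     # LRU cache, "in DRAM" (fast tier)
        self.hits = 0
        self.misses = 0

    def lookup(self, row_id):
        if row_id in self.dram_cache:
            self.dram_cache.move_to_end(row_id)   # refresh LRU position
            self.hits += 1
            return self.dram_cache[row_id]
        # Miss: read the row from the SCM tier and promote it to DRAM.
        self.misses += 1
        vec = self.scm_table[row_id]
        self.dram_cache[row_id] = vec
        if len(self.dram_cache) > self.dram_capacity:
            self.dram_cache.popitem(last=False)   # evict least recently used
        return vec

# Usage: repeated lookups of a few "hot" rows, as in skewed DLRM traffic.
table = {i: [float(i)] * 4 for i in range(1000)}   # toy embedding table
store = TieredEmbeddingStore(table, dram_capacity=8)
for _ in range(10):
    for row in (1, 2, 3):
        store.lookup(row)
print(store.hits, store.misses)  # → 27 3: only the first pass touches SCM
```

Because accesses are skewed, only the first pass over the hot rows touches the slow tier; all later lookups are served from DRAM, which is the property that makes SCM viable as a capacity tier for inference.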