Neural personalized recommendation models are used across a wide variety of datacenter applications, including search, social media, and entertainment. State-of-the-art models comprise large embedding tables with billions of parameters, which require large memory capacities. Unfortunately, large and fast DRAM-based memories incur high infrastructure costs. Conventional SSD-based storage solutions offer an order of magnitude more capacity, but their higher read latency and lower bandwidth degrade inference performance. RecSSD is a near-data-processing SSD memory system customized for neural recommendation inference. Across eight industry-representative models, it reduces end-to-end model inference latency by 2× compared to using commercial off-the-shelf (COTS) SSDs.