Memory disaggregation is being considered as a strong alternative to traditional architectures for addressing memory under-utilization in data centers. Disaggregated memory can adapt to the dynamically changing memory requirements of data center applications, such as data analytics and big data workloads, that rely on in-memory processing. However, such systems can suffer from high remote memory access latency because of interconnect speeds. In this paper, we explore a rack-scale disaggregated memory architecture and discuss its design aspects. We design a trace-driven simulator that combines an event-based interconnect simulator with a cycle-accurate memory simulator to evaluate the performance of a disaggregated memory system at rack scale. Our study shows that not only the interconnect but also contention in the remote memory queues adds significantly to remote memory access latency. We introduce a memory allocation policy that reduces this latency compared to conventional policies. We conduct experiments using various benchmarks with diverse memory access patterns. Our results are encouraging for rack-scale memory disaggregation and show acceptable average memory access latency.
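To illustrate how queue contention can add to remote access latency on top of the interconnect delay, the following is a minimal sketch of a single remote memory node served in FIFO order. It is not the simulator described in this paper: the function name `remote_access_latencies`, the fixed one-way link delay, the fixed service time, and all numeric values are hypothetical assumptions chosen only for illustration.

```python
def remote_access_latencies(arrival_ns, link_ns, service_ns):
    """Per-request latency for one remote memory node with a FIFO queue.

    Assumptions (illustrative only): a fixed one-way interconnect delay
    `link_ns` and a fixed memory service time `service_ns`.
    """
    latencies = []
    server_free_at = 0  # time at which the remote memory becomes idle
    for t in sorted(arrival_ns):
        at_memory = t + link_ns                  # request crosses the interconnect
        start = max(at_memory, server_free_at)   # wait if earlier requests are queued
        done = start + service_ns                # memory services the request
        server_free_at = done
        latencies.append(done + link_ns - t)     # response crosses back
    return latencies


# Four requests issued at the same instant contend in the remote queue.
burst = remote_access_latencies([0, 0, 0, 0], link_ns=100, service_ns=50)
# The same four requests spaced 100 ns apart never queue.
spaced = remote_access_latencies([0, 100, 200, 300], link_ns=100, service_ns=50)

print(burst)   # [250, 300, 350, 400]
print(spaced)  # [250, 250, 250, 250]
```

In this toy setting the interconnect contributes the same 200 ns round trip to every request, while the extra latency in the burst case comes entirely from waiting in the remote memory queue.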