Traditional memory management suffers from metadata overhead, architectural complexity, and stability degradation, and these problems intensify in cloud environments. Existing software and hardware optimizations are insufficient for cloud computing's dual demands of flexibility and low overhead. This paper presents Vmem, a memory management architecture for in-production cloud environments that enables flexible, efficient use of cloud server memory through lightweight reserved memory management. Vmem is the first such architecture to support online upgrades, meeting cloud requirements for high stability and rapid iterative evolution. Experiments show that Vmem increases the sellable memory rate by about 2%, delivers extreme elasticity and performance, boots VFIO-based virtual machines (VMs) more than 3x faster, and improves network performance by about 10% for DPU-accelerated VMs. Vmem has been deployed at large scale for seven years, demonstrating its efficiency and stability on more than 300,000 cloud servers supporting hundreds of millions of VMs.