Unlike other resources such as CPU, memory, and network, for which virtualization is efficiently achieved through direct access, disk virtualization is peculiar. In this paper, we make four contributions. First, we characterize disk utilization in a large-scale public cloud infrastructure. This characterization reveals the presence of long snapshot chains, sometimes composed of up to 1000 files. Second, we show through experimental measurements that long chains lead to performance and memory footprint scalability issues. Third, we extend the Qcow2 format and its driver in Qemu to address the identified scalability challenges. Fourth, we thoroughly evaluate our prototype, called sQemu, demonstrating that it brings significant performance improvements and memory footprint reduction. For example, on a snapshot chain of length 500, it improves the throughput of RocksDB by about 48% compared to vanilla Qemu and reduces the memory overhead by 15x.