The overhead of the kernel storage path accounts for half of the access latency on new NVMe storage devices. We explore using BPF to reduce this overhead by injecting user-defined functions deep in the kernel's I/O processing stack. When issuing a series of dependent I/O requests, this approach can increase IOPS by over 2.5$\times$ and cut latency in half, by bypassing kernel layers and avoiding user-kernel boundary crossings. However, bypassing the file system and block layer must not sacrifice important properties, such as the file system's safety guarantees and the translation between file offsets and physical block addresses. We sketch potential solutions to these problems, inspired by exokernel file systems from the late 90s, whose time, we believe, has finally come!