Modern NVMe SSDs and RDMA networks provide dramatically higher bandwidth and concurrency. Existing networked storage systems (e.g., NVMe over Fabrics) fail to fully exploit these devices because their storage ordering guarantees are inefficient. To enforce storage order, these systems execute ordered requests synchronously, which stalls the CPU and I/O devices and lowers the CPU and I/O efficiency of the storage system. We present Rio, a new approach to storage ordering for remote storage access. The key insight in Rio is that the layered design of the software stack, along with concurrent and asynchronous network and storage devices, makes the storage stack conceptually similar to a CPU pipeline. Inspired by CPU pipelines that execute out of order and commit in order, Rio introduces the I/O pipeline, which allows internal out-of-order and asynchronous execution of ordered write requests while preserving the external storage order that applications see. Together with merging consecutive ordered requests, these design decisions bring write throughput and CPU efficiency close to those of orderless requests. We implement Rio in the Linux NVMe over RDMA stack and build a file system named RioFS atop Rio. Evaluations show that Rio outperforms Linux NVMe over RDMA by two orders of magnitude and Horae, a state-of-the-art storage stack, by 4.9 times on average in throughput of ordered write requests. RioFS increases the throughput of RocksDB by 1.9 times and 1.5 times on average over Ext4 and HoraeFS, respectively.
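The idea of executing out of order while committing in order can be illustrated with a small sketch. This is not Rio's implementation. It is a toy model under stated assumptions: each ordered write carries a hypothetical sequence number starting at 0, completions may arrive in any order, and a write becomes visible only once every earlier write has completed. The names `InOrderCommitter` and `merge_consecutive` are invented for illustration, and the merging rule (two requests adjacent in order are merged only when their block ranges are contiguous) is likewise an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class InOrderCommitter:
    """Toy model: writes complete in any order but commit in sequence order."""
    next_to_commit: int = 0                         # lowest sequence number not yet committed
    completed: set = field(default_factory=set)     # finished but not yet committed
    committed: list = field(default_factory=list)   # commit history, in order

    def complete(self, seq: int) -> list:
        """Record that write `seq` finished; return sequence numbers newly committed."""
        self.completed.add(seq)
        released = []
        # Release the longest run of completed writes starting at next_to_commit.
        while self.next_to_commit in self.completed:
            self.completed.remove(self.next_to_commit)
            released.append(self.next_to_commit)
            self.next_to_commit += 1
        self.committed.extend(released)
        return released


def merge_consecutive(requests):
    """Merge requests that are adjacent in order and contiguous in block address.

    `requests` is a list of (start_block, length) tuples in submission order
    (a hypothetical representation, not Rio's request format).
    """
    merged = []
    for start, length in requests:
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_len = merged[-1]
            merged[-1] = (prev_start, prev_len + length)
        else:
            merged.append((start, length))
    return merged
```

In this model, if write 2 completes first, nothing is committed; once writes 0 and 1 complete, all three become visible in order. The caller never waits synchronously on an individual write, yet the externally observable order is unchanged.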