The Big Data trend is putting strain on modern storage systems, which must support high-performance I/O access to large quantities of data. In the prevalent Von Neumann computing architecture, this data is constantly moved back and forth between the computing entity (i.e., the CPU) and the storage entities (DRAM and Non-Volatile Memory (NVM) storage). Hence, as data volumes grow, this constant data movement between the CPU and storage devices has emerged as a key performance bottleneck. To improve the situation, researchers have advocated leveraging computational storage devices (CSDs), which offer a programmable interface for running user-defined data processing operations close to the storage without excessive data movement, thus improving performance. However, despite their potential, building CSD-aware applications remains a challenging task because the right APIs and abstractions have seen little exploration and experimentation. This stems from the limited accessibility of the latest CSD/NVM devices, still-emerging device interfaces, and the closed-source software internals of the devices. To remedy the situation, in this work we present an open-source CSD prototype over emerging NVMe Zoned Namespaces (ZNS) SSDs and an interface that can be used to explore application designs for CSD/NVM storage devices. In this paper we summarize the current state of the practice with CSD devices, make a case for designing a CSD prototype with the ZNS interface and eBPF (ZCSD), and present our initial findings. The prototype is available at https://github.com/Dantali0n/qemu-csd.