Skyhook: Towards an Arrow-Native Storage System

With ever-increasing dataset sizes, several file formats such as Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec, the CPU has become the bottleneck as it tries to keep up with feeding data in and out of these fast devices. The result is that data access libraries executed on single clients are often CPU-bound and cannot utilize the scale-out benefits of distributed storage systems. One attractive solution to this problem is to offload data-reducing processing and filtering tasks to the storage layer. However, modifying legacy storage systems to support compute offloading is often tedious and requires an extensive understanding of the system internals. Previous approaches re-implemented the functionality of data processing frameworks and access libraries for a particular storage system, a duplication of effort that might have to be repeated for each different storage system. This paper introduces a new design paradigm that allows extending programmable object storage systems to embed existing, widely used data processing frameworks and access libraries into the storage layer with no modifications. In this approach, data processing frameworks and access libraries can evolve independently from storage systems while leveraging the scale-out and availability properties of distributed storage systems. We present Skyhook, an example implementation of our design paradigm using Ceph, Apache Arrow, and Parquet. We provide a brief performance evaluation of Skyhook and discuss key results.
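The offloading idea in the abstract can be illustrated with a toy sketch. It is not Skyhook's implementation and uses no Ceph or Arrow API: `StorageObject`, `client_side_scan`, and `offloaded_scan` are hypothetical names introduced only for illustration, and "rows transferred" is a crude stand-in for network traffic. The sketch contrasts a client that pulls every row and filters locally with a client that asks each storage object to filter and project before returning data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, object]


@dataclass
class StorageObject:
    """Hypothetical stand-in for one object held by a storage node."""
    rows: List[Row]

    def read_all(self) -> List[Row]:
        # No offloading: every row is shipped to the client.
        return list(self.rows)

    def scan(self, columns: Sequence[str],
             predicate: Callable[[Row], bool]) -> List[Row]:
        # Offloaded: filtering and projection run next to the data,
        # so only the matching rows and requested columns are returned.
        return [{c: r[c] for c in columns} for r in self.rows if predicate(r)]


def client_side_scan(objects: Sequence[StorageObject], columns: Sequence[str],
                     predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    """Fetch everything, then filter and project on the client."""
    result: List[Row] = []
    transferred = 0
    for obj in objects:
        rows = obj.read_all()
        transferred += len(rows)
        result.extend({c: r[c] for c in columns} for r in rows if predicate(r))
    return result, transferred


def offloaded_scan(objects: Sequence[StorageObject], columns: Sequence[str],
                   predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    """Ask each storage object to filter and project before returning rows."""
    result: List[Row] = []
    transferred = 0
    for obj in objects:
        rows = obj.scan(columns, predicate)
        transferred += len(rows)
        result.extend(rows)
    return result, transferred


# Illustrative data: four objects of 100 rows each (made-up values).
objects = [
    StorageObject([{"id": i, "value": i * 10, "tag": "x"}
                   for i in range(k * 100, (k + 1) * 100)])
    for k in range(4)
]


def is_large(row: Row) -> bool:
    return row["value"] >= 3900


local_rows, local_cost = client_side_scan(objects, ["id", "value"], is_large)
remote_rows, remote_cost = offloaded_scan(objects, ["id", "value"], is_large)
```

Both functions return the same rows; only the amount of data that leaves the storage objects differs, which is the effect the abstract attributes to pushing data-reducing work into the storage layer.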
