Upcoming HEP experiments, e.g. at the HL-LHC, are expected to increase the volume of generated data by at least one order of magnitude. To retain the ability to analyze this influx of data, full exploitation of modern storage hardware and systems, such as low-latency, high-bandwidth NVMe devices and distributed object stores, becomes critical. To this end, the ROOT RNTuple I/O subsystem has been designed to address the performance bottlenecks and shortcomings of TTree, ROOT's current state-of-the-art I/O subsystem. RNTuple is a backwards-incompatible redesign of the TTree binary format and access API that evolves ROOT event data I/O to meet the challenges of the upcoming decades. It focuses on a compact data format, on performance engineering for modern storage hardware (for instance, by making parallel and asynchronous I/O calls by default), and on robust interfaces that are easy to use correctly. In this contribution, we evaluate RNTuple performance for typical HEP analysis tasks. We compare the throughput delivered by RNTuple to that of popular I/O libraries outside HEP, such as HDF5 and Apache Parquet. We demonstrate the advantages of RNTuple for HEP analysis workflows and provide an outlook on the road to its use in production.