This paper aims to create a transition path from file-based IO to streaming-based workflows for scientific applications in an HPC environment. With the openPMD-api, traditional workflows limited by filesystem bottlenecks can be overcome and flexibly extended for in situ analysis. The openPMD-api is a library for describing scientific data according to the Open Standard for Particle-Mesh Data (openPMD). It addresses recent challenges posed by hardware heterogeneity by decoupling data description in domain sciences, such as plasma physics simulations, from concrete implementations in hardware and IO. The streaming backend is provided by the ADIOS2 framework, developed at Oak Ridge National Laboratory. This paper surveys two openPMD-based loosely coupled setups to demonstrate flexible applicability and to evaluate performance. In loose coupling, as opposed to tight coupling, two (or more) applications are executed separately, e.g. in individual MPI contexts, yet cooperate by exchanging data. A streaming-based workflow thus allows for standalone codes instead of tightly coupled plugins, using a unified streaming-aware API and leveraging the high-speed communication infrastructure available in modern compute clusters for massive data exchange. We identify new challenges in resource allocation and in the need for flexible data distribution strategies, and demonstrate their influence on efficiency and scaling on the Summit compute system. The presented setups show the potential for a more flexible use of compute resources brought by streaming IO as well as the ability to increase throughput by avoiding filesystem bottlenecks.
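The loosely coupled pattern described above can be illustrated with a minimal sketch. It does not use the openPMD-api or ADIOS2: a bounded in-memory queue stands in for the streaming backend, and two threads stand in for separately launched applications. The names `simulation`, `analysis`, and `run_pipeline` are hypothetical and chosen only for this illustration.

```python
import queue
import threading

# Sentinel object marking the end of the stream (an assumption of this sketch).
END_OF_STREAM = None


def simulation(stream, n_steps):
    """Producer: publishes one data step at a time instead of writing a file."""
    for step in range(n_steps):
        data = [float(step * 10 + i) for i in range(4)]
        # put() blocks while the bounded queue is full, so the producer
        # cannot run arbitrarily far ahead of the consumer.
        stream.put((step, data))
    stream.put(END_OF_STREAM)


def analysis(stream, results):
    """Consumer: processes each step as it arrives (in situ style)."""
    while True:
        item = stream.get()
        if item is END_OF_STREAM:
            break
        step, data = item
        results[step] = sum(data) / len(data)


def run_pipeline(n_steps=3):
    """Run producer and consumer concurrently, coupled only by the stream."""
    stream = queue.Queue(maxsize=2)  # bounded buffer between the two sides
    results = {}
    producer = threading.Thread(target=simulation, args=(stream, n_steps))
    consumer = threading.Thread(target=analysis, args=(stream, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


if __name__ == "__main__":
    print(run_pipeline())
```

The two functions share nothing except the stream object, which is the property that lets the analysis be a standalone code rather than a plugin compiled into the simulation. In an actual HPC setup the two sides would be separate applications, e.g. in individual MPI contexts, exchanging data through the streaming backend.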