This paper aims to create a transition path from file-based IO to streaming-based workflows for scientific applications in an HPC environment. By using the openPMD-api, traditional workflows limited by filesystem bottlenecks can be overcome and flexibly extended for in situ analysis. The openPMD-api is a library for the description of scientific data according to the Open Standard for Particle-Mesh Data (openPMD). It addresses recent challenges posed by hardware heterogeneity by decoupling the data description used in domain sciences, such as plasma physics simulations, from concrete implementations in hardware and IO. The streaming backend is provided by the ADIOS2 framework, developed at Oak Ridge National Laboratory. This paper surveys two openPMD-based loosely-coupled setups to demonstrate flexible applicability and to evaluate performance. In loose coupling, as opposed to tight coupling, two (or more) applications are executed separately, e.g. in individual MPI contexts, yet cooperate by exchanging data. In this way, a streaming-based workflow allows for standalone codes instead of tightly-coupled plugins, using a unified streaming-aware API and leveraging the high-speed communication infrastructure available in modern compute clusters for massive data exchange. We identify new challenges in resource allocation and in the need for flexible data distribution strategies, and demonstrate their influence on efficiency and scaling on the Summit compute system. The presented setups show the potential for a more flexible use of compute resources brought by streaming IO, as well as the ability to increase throughput by avoiding filesystem bottlenecks.
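The loose-coupling pattern described above can be illustrated with a minimal sketch. It uses only the Python standard library and is not the openPMD-api or ADIOS2 interface: two threads stand in for the separately launched applications, a bounded in-memory queue stands in for the streaming backend, and all names (`producer`, `consumer`, `run_pipeline`, `max_steps_in_flight`) are hypothetical and chosen for illustration only.

```python
import queue
import threading

# Assumption: a sentinel value marks the end of the stream (illustrative convention).
END_OF_STREAM = None


def producer(stream, n_steps):
    """Stand-in for a simulation code: emit one data step at a time."""
    for step in range(n_steps):
        data = [step * 10 + i for i in range(4)]  # toy payload for this step
        stream.put((step, data))  # blocks while the queue is full
    stream.put(END_OF_STREAM)


def consumer(stream, results):
    """Stand-in for an in situ analysis code: reduce each step as it arrives."""
    while True:
        item = stream.get()
        if item is END_OF_STREAM:
            break
        step, data = item
        results[step] = sum(data)


def run_pipeline(n_steps=3, max_steps_in_flight=2):
    """Run writer and reader concurrently; no step is written to a file."""
    stream = queue.Queue(maxsize=max_steps_in_flight)
    results = {}
    writer = threading.Thread(target=producer, args=(stream, n_steps))
    reader = threading.Thread(target=consumer, args=(stream, results))
    writer.start()
    reader.start()
    writer.join()
    reader.join()
    return results


if __name__ == "__main__":
    print(run_pipeline())
```

The bounded queue means the writer waits whenever the reader falls behind, which hints at why resource allocation between the two sides affects overall efficiency in a streaming workflow.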