The hybrid MPI+X programming paradigm, where X refers to threads or GPUs, has gained prominence in high-performance computing, reflecting a trend toward increasingly heterogeneous system architectures. The current MPI standard specifies only the levels of compatibility between MPI and threading runtimes. It provides no concept or interface through which applications can explicitly pass thread context or GPU stream context to MPI implementations. This gap makes performance optimization complicated in some cases and impossible in others. We propose a new MPI concept, the MPIX stream, to represent the general serial execution context that exists in X runtimes. MPIX streams can be mapped directly to threads or to GPU execution streams. Passing thread context into MPI allows implementations to map execution contexts precisely to network endpoints. Passing GPU execution context into MPI allows implementations to operate directly on GPU streams, lowering the cost of CPU/GPU synchronization.
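The endpoint-mapping idea in the abstract can be illustrated with a small toy model. The sketch below is not the proposed MPIX interface and uses no MPI library; the names `Endpoint`, `Stream`, and `ToyRuntime` are hypothetical and exist only for illustration. It assumes a runtime that owns a fixed pool of endpoints, reserves endpoint 0 for callers that pass no context, and models locking as a simple counter. The GPU stream case is not modeled.

```python
"""Toy model: map serial execution contexts ("streams") to network endpoints.

All names here (Endpoint, Stream, ToyRuntime) are hypothetical and are not
part of MPI or of any MPI implementation.
"""
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Endpoint:
    """One network endpoint; records lock acquisitions and sent payloads."""
    index: int
    lock_acquisitions: int = 0
    messages: list = field(default_factory=list)


@dataclass(frozen=True)
class Stream:
    """Handle for one serial execution context, bound to one endpoint."""
    endpoint_index: int


class ToyRuntime:
    def __init__(self, num_endpoints: int) -> None:
        # Assumption: endpoint 0 is shared by every caller that passes no stream.
        self.endpoints = [Endpoint(i) for i in range(num_endpoints)]
        self._next_dedicated = 1

    def stream_create(self) -> Stream:
        """Reserve a dedicated endpoint for a new serial context."""
        if self._next_dedicated >= len(self.endpoints):
            raise RuntimeError("no dedicated endpoints left")
        stream = Stream(self._next_dedicated)
        self._next_dedicated += 1
        return stream

    def send(self, payload: str, stream: Optional[Stream] = None) -> int:
        """Send on the stream's endpoint, or on shared endpoint 0.

        Returns the index of the endpoint that was used.
        """
        if stream is None:
            # Caller context is unknown, so the runtime has to assume
            # concurrent access and serialize on the shared endpoint.
            endpoint = self.endpoints[0]
            endpoint.lock_acquisitions += 1
        else:
            # The stream promises serial use, so no lock is modeled here.
            endpoint = self.endpoints[stream.endpoint_index]
        endpoint.messages.append(payload)
        return endpoint.index


if __name__ == "__main__":
    runtime = ToyRuntime(num_endpoints=3)
    first = runtime.stream_create()
    second = runtime.stream_create()
    runtime.send("no context")             # falls back to shared endpoint 0
    runtime.send("ctx 1", stream=first)    # dedicated endpoint 1
    runtime.send("ctx 2", stream=second)   # dedicated endpoint 2
    for endpoint in runtime.endpoints:
        print(endpoint)
```

In this model, a send without a stream pays for serialization on the shared endpoint, while a send with a stream is routed to its own endpoint with no lock. That is the kind of mapping the abstract says becomes possible once the execution context is passed in explicitly.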