Modern heterogeneous supercomputing systems are composed of compute blades that offer both CPUs and GPUs. On such systems, it is essential to move data efficiently between these compute engines across a high-speed network. Although current-generation scientific applications and system software stacks are GPU-aware, CPU threads are still required to orchestrate data-movement communication operations and inter-process synchronization operations. A new GPU stream-aware MPI communication strategy, called stream-triggered (ST) communication, is explored to allow the control paths of both computation and communication to be offloaded to the GPU. The proposed ST communication strategy is implemented on HPE Slingshot Interconnects over a new proprietary HPE Slingshot NIC (Slingshot 11) using the supported triggered operations feature. The performance of the proposed strategy is evaluated using a microbenchmark kernel called Faces, which is based on the nearest-neighbor communication pattern in the CORAL-2 Nekbone benchmark, on a heterogeneous node architecture consisting of AMD CPUs and GPUs.
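To make the control-path idea concrete, the following is a minimal toy sketch in plain Python, not the implementation described above. It assumes the common counter-and-threshold model of triggered operations, in which an operation is enqueued ahead of time and released only when a counter reaches a threshold; the abstract itself does not spell out the mechanism. `TriggerCounter`, `arm`, `increment`, and `run_demo` are hypothetical names chosen for illustration and do not correspond to any MPI, HPE, or GPU runtime API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class TriggerCounter:
    """Toy stand-in for a NIC counter (hypothetical, not a real API)."""
    value: int = 0
    # Deferred operations, each paired with the counter value that releases it.
    pending: List[Tuple[int, Callable[[], None]]] = field(default_factory=list)

    def arm(self, threshold: int, op: Callable[[], None]) -> None:
        """Enqueue op now; it runs only once the counter reaches threshold."""
        if self.value >= threshold:
            op()
        else:
            self.pending.append((threshold, op))

    def increment(self) -> None:
        """Models the GPU stream bumping the counter after a kernel finishes."""
        self.value += 1
        ready = [op for t, op in self.pending if t <= self.value]
        self.pending = [(t, op) for t, op in self.pending if t > self.value]
        for op in ready:
            op()


def run_demo() -> List[str]:
    log: List[str] = []
    counter = TriggerCounter()
    # The host arms the send ahead of time, then leaves the control path.
    counter.arm(1, lambda: log.append("send halo"))
    log.append("send armed")
    # Stream order: the compute kernel completes, then the trigger fires.
    log.append("kernel done")
    counter.increment()
    return log


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the send is set up before the kernel completes and is released by the counter update rather than by a host thread waiting on the kernel, which is the kind of CPU involvement ST communication aims to remove.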