Hardware-agnostic programming with high performance portability will be the bedrock for the ubiquitous adoption of emerging accelerator technologies in future heterogeneous high-performance computing (HPC) systems, and it is key to reaching the next level of HPC performance on an expanding accelerator landscape. In this paper, we present HALO 1.0, an open-ended, extensible multi-agent software framework that implements a set of proposed hardware-agnostic accelerator orchestration (HALO) principles and a novel compute-centric message passing interface (C^2MPI) specification, enabling portable and performance-optimized execution of hardware-agnostic application host codes across heterogeneous accelerator resources. Experimental results from evaluating eight widely used HPC subroutines on Intel Xeon E5-2620 v4 CPUs, Intel Arria 10 GX FPGAs, and NVIDIA GeForce RTX 2080 Ti GPUs show that HALO 1.0 provides a unified control flow that lets the host program run across all of these computing devices with a consistently maximum performance portability score of 1.0, which is 2x-861,883x higher than that of the OpenCL-based solution, whose performance portability score is low and unstable.
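To make the performance portability score concrete, the following is a minimal sketch that assumes the score follows the commonly used harmonic-mean definition: the harmonic mean of an application's per-platform efficiencies, falling to 0 if any platform is unsupported. The abstract does not state the formula, so this definition, the function name, and the efficiency values are illustrative assumptions rather than the paper's measurements.

```python
from typing import Mapping, Optional


def performance_portability(efficiencies: Mapping[str, Optional[float]]) -> float:
    """Harmonic mean of per-platform efficiencies (assumed definition).

    `efficiencies` maps a platform name to an efficiency in (0, 1], or to
    None when the application does not run on that platform. The score is
    0.0 if any platform is unsupported.
    """
    if not efficiencies:
        raise ValueError("at least one platform is required")
    values = list(efficiencies.values())
    if any(e is None or e <= 0.0 for e in values):
        return 0.0
    return len(values) / sum(1.0 / e for e in values)


# Hypothetical efficiencies, chosen only to illustrate the arithmetic.
uniform = {"cpu": 1.0, "fpga": 1.0, "gpu": 1.0}
uneven = {"cpu": 1.0, "fpga": 0.5, "gpu": 0.25}
unsupported = {"cpu": 1.0, "fpga": None, "gpu": 1.0}

score_uniform = performance_portability(uniform)          # 3 / (1 + 1 + 1)
score_uneven = performance_portability(uneven)            # 3 / (1 + 2 + 4)
score_unsupported = performance_portability(unsupported)  # unsupported platform
```

Under this definition, a score of 1.0 is reached only when the efficiency is 1.0 on every platform, and a single poorly performing platform pulls the score down sharply, which is one way a solution can end up with a low and unstable score.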