Programming heterogeneous systems efficiently is a major challenge because of the complexity of their architectures. Intel oneAPI, a new standards-based unified programming model built on top of SYCL, addresses this problem. In this paper, oneAPI is extended with co-execution strategies that run the same kernel across different devices simultaneously, under either static or dynamic policies. Static and dynamic load-balancing algorithms are integrated on top of this support and analyzed. This work evaluates performance and energy efficiency on a well-known set of regular and irregular HPC benchmarks, using an integrated GPU and a CPU. Experimental results show that co-execution is worthwhile when dynamic algorithms are used, and that unified shared memory improves efficiency further.
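To make the distinction between static and dynamic co-execution policies concrete, the following is a minimal simulation sketch, not the implementation evaluated in the paper and not SYCL or oneAPI code. It assumes two hypothetical devices with fixed relative throughputs (`speeds`), a list of per-work-item costs (`costs`), and two illustrative helpers, `static_split` and `dynamic_chunks`, all of which are introduced here only for illustration. The static policy fixes one split before launch; the dynamic policy hands the next fixed-size chunk to whichever device becomes idle first.

```python
import heapq


def static_split(costs, speeds, cpu_share):
    """Static policy (illustrative): one fixed split chosen before launch.

    The first `cpu_share` fraction of the work items goes to the CPU and
    the rest to the GPU. Returns the simulated completion time.
    """
    cut = int(len(costs) * cpu_share)
    cpu_time = sum(costs[:cut]) / speeds["cpu"]
    gpu_time = sum(costs[cut:]) / speeds["gpu"]
    return max(cpu_time, gpu_time)


def dynamic_chunks(costs, speeds, chunk):
    """Dynamic policy (illustrative): idle device takes the next chunk.

    A min-heap keyed on the time each device becomes free models the
    scheduler. Returns the simulated completion time.
    """
    ready = [(0.0, name) for name in sorted(speeds)]
    heapq.heapify(ready)
    for start in range(0, len(costs), chunk):
        free_at, name = heapq.heappop(ready)
        work = sum(costs[start:start + chunk])
        heapq.heappush(ready, (free_at + work / speeds[name], name))
    return max(free_at for free_at, _ in ready)


# Hypothetical throughputs: the GPU is assumed to be 3x faster than the CPU.
speeds = {"cpu": 1.0, "gpu": 3.0}

# A regular workload (uniform cost) and an irregular one (cost varies by item).
regular = [1.0] * 1200
irregular = [1.0] * 600 + [5.0] * 600

for label, costs in (("regular", regular), ("irregular", irregular)):
    print(label,
          "static:", static_split(costs, speeds, cpu_share=0.25),
          "dynamic:", dynamic_chunks(costs, speeds, chunk=10))
```

Under these assumptions, a 25% CPU share matches the throughput ratio and balances the regular workload perfectly (300 time units on each device). The same split on the irregular workload leaves the GPU with most of the expensive items and finishes at 1100 time units, against an ideal of 900 for a perfectly balanced schedule. The dynamic policy adapts to the cost variation at run time, at the price of one scheduling decision per chunk, which this sketch does not model.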