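Project title: Research on Work-Stealing-Based Load Balancing on Heterogeneous Systems
Project number: No. 61303059
Project type: Young Scientists Fund
Year of approval: 2014
Discipline: Automation technology, computer technology
Project author: 马文静 (Ma Wenjing)
Affiliation: 中国科学院软件研究所 (Institute of Software, Chinese Academy of Sciences)
Funding: 230,000 CNY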
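Chinese abstract (translated): Today's mainstream supercomputing systems are often not homogeneous; many are equipped with accelerators such as GPUs. In large-scale computation, load balance is critical to performance and power consumption. Traditional task scheduling methods, however, are often difficult to apply directly to heterogeneous systems to achieve load balance. Building on the widely used work-stealing method, this project studies load balancing for programs that can make full use of accelerators such as GPUs. Such programs include several chemistry computation methods (TCE, SCF, etc.) and the n-body problem. The main research topics are: (1) improving the work-stealing scheme, optimizing the stealing mechanism through methods such as using tasks of different sizes and coordinating the task queues of different processors; (2) pipelining and batching tasks on accelerators to make full use of computing resources and better achieve load balance; (3) optimizing task scheduling according to data locality so that access to shared data is more efficient. This research fully exploits the computing potential of heterogeneous systems and provides a foundation for load balancing research on future exascale systems.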
Chinese keywords (translated): GPGPU; parallel computing; optimization; load balancing; machine learning
English abstract: Today's mainstream supercomputers are often not homogeneous; many are equipped with accelerators such as GPUs. In large-scale computation, load balance is a very important factor for performance and energy efficiency. Traditional task scheduling methods are not sufficient for achieving load balance on heterogeneous systems. This project will investigate load balancing methods based on work stealing, mainly targeting applications that can make good use of accelerators such as GPUs. These applications include chemistry computations such as TCE and SCF, as well as other applications such as the n-body problem. The main research topics are: (1) modifying the work stealing scheme, using different task sizes and coordination among the queues of different processors to optimize the work stealing algorithms for better load balance; (2) using pipelining and kernel consolidation for tasks on accelerators, thus making good use of the computing resources to achieve load balance; (3) optimizing the scheduling algorithm according to data locality, thus enabling more efficient access to global data. This project will explore the computing power of heterogeneous systems, providing insights for load balancing on exascale systems.
English keywords: GPGPU; parallel computing; optimization; load balancing; machine learning