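Project title: Research on Co-Scheduling and Dynamic Load Balancing Mechanisms for Hybrid-Grained Tasks on Heterogeneous GPU Clusters
Project number: No.61202005
Project type: Young Scientists Fund project
Year of approval: 2013
Discipline: Computer Science
Project author: Li Tao (李涛)
Affiliation: Nankai University (南开大学)
Funding: CNY 220,000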
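Abstract (translated from Chinese): GPU cluster computing is currently a focus of high-performance computing research in China and abroad, and it matters for fields that require large-scale data processing, such as biology, finance, and meteorology. Although general-purpose parallel computing architectures such as CUDA can effectively exploit the computing power of GPUs, the use of these accelerators (coprocessors) introduces new communication and storage problems, which make it difficult to use the overall computing power of a GPU cluster efficiently. This project considers performance-related architectural factors from three aspects (computation, communication, and storage) and builds a performance model suited to heterogeneous GPU clusters. By analyzing the computing patterns of typical applications on GPU clusters, it proposes a hybrid-grained task model for heterogeneous GPU clusters and, on that basis, implements task co-scheduling and dynamic load balancing mechanisms. Building on distributed data management and efficient communication mechanisms, it implements a high-performance computing framework that supports multiple scheduling strategies, and uses large-scale computing problems for performance testing and tuning. Targeting the new architectural features and programming models that GPUs bring, the project proposes hybrid-grained task scheduling and dynamic balancing mechanisms aimed at improving the computing efficiency of heterogeneous GPU clusters, offering new ideas and methods for research on and applications of large-scale computing on such clusters.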
Keywords (translated from Chinese): GPU cluster;hybrid granularity;co-scheduling;dynamic balancing;CUDA
English abstract: GPU cluster computing is a research hotspot in the high-performance computing community. It plays an important role in biology, finance, meteorology, and other areas that need large-scale data processing. Although general-purpose parallel computing architectures such as CUDA can effectively exploit the computing power of GPUs, the use of these accelerators also brings new problems, such as CPU-GPU and GPU-GPU communication and data storage. As a result, the overall computing power of a GPU cluster cannot be used efficiently. The project builds a performance model suited to heterogeneous GPU clusters, covering the architectural factors that affect performance from three aspects: computation, communication, and data storage. The project then proposes a hybrid-grained task model for heterogeneous GPU clusters by analyzing the computing patterns of typical applications executed on GPU clusters, including the concurrency within and between threads and the CPU-GPU cooperation mechanism. The project also proposes co-scheduling and dynamic load balancing mechanisms based on this task model. Finally, the project implements a high-performance computing framework that supports multiple scheduling strategies and is built on distributed data management and highly efficient communication. Its performance is
English keywords: GPU cluster;hybrid-grained;co-scheduling;dynamic balancing;CUDA