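Project title: Research on Replica-Aware Energy-Efficient Scheduling Optimization Mechanisms for Data-Intensive Computing
Project number: No.61202173
Project type: Young Scientists Fund Project
Year of approval: 2013
Discipline: Computer Science
Project author: Liu Wei (刘伟)
Affiliation: Wuhan University of Technology (武汉理工大学)
Funding: 230,000 yuan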
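Chinese abstract: With the rapid development of high-performance computing platforms, the application domains of data-intensive computing keep expanding, and both computational complexity and data transfer volumes keep growing; the resulting high energy consumption urgently needs to be addressed. Because transferring large-scale data sets greatly affects the completion time and energy consumption of such tasks, an energy-efficient scheduling mechanism suited to this environment must be tightly integrated with replica selection and data replication strategies. This project studies replica-aware energy-efficient scheduling optimization mechanisms and algorithms for data-intensive computing. Drawing on the weighted set cover idea, it proposes a replica selection algorithm that accounts for both access cost and transfer time, reducing the volume of data transferred while providing users with economical data resources. It establishes an energy consumption model that reflects task characteristics and, taking the completion time and cost budget of an independent task set as constraints, designs an energy-efficient scheduling algorithm that seeks a balance among performance, economic cost, and energy consumption. It proposes a three-stage dynamic data replication algorithm oriented toward low energy consumption, aiming at a trade-off among system performance, data availability, and energy consumption. Finally, it develops a simulation platform suited to data-intensive computing environments to test the effectiveness and performance of the proposed algorithms and to analyze the simulation results theoretically, providing a theoretical basis for selecting and applying the algorithms in real environments.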
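Chinese keywords: data-intensive computing;energy-efficient scheduling;data replica selection;data replication and placement;replica awareness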
English abstract: With the rapid development of high-performance computing platforms and the continual expansion of the application fields of data-intensive computing, the complexity of the computation involved and the volume of data transferred have been on the rise. The resulting problem of high energy consumption needs to be resolved urgently. Because the transmission of large data sets significantly affects the completion time and energy consumption of these tasks, an energy-efficient scheduling mechanism for this environment must be combined with replica selection and data replication strategies. This project is devoted to replica-aware energy-efficient scheduling optimization mechanisms and algorithms for data-intensive computing. Firstly, based on the idea of the weighted set covering problem, a replica selection algorithm is proposed that takes into consideration both access cost and transmission time. The algorithm not only reduces the volume of data transferred, but also provides economical data resources for users. Secondly, an energy consumption model that reflects the characteristics of tasks is established. To strike a balance among performance, economic cost, and energy consumption, an energy-efficient scheduling algorithm is designed under the constraints of the completion time and cost budget of an independent task set
English keywords: data-intensive computing;energy-efficient scheduling;data replica selection;data replication and placement;replica aware