Cloud platforms have emerged as a prominent environment for executing high performance computing (HPC) applications, providing on-demand resources as well as scalability. They usually offer different classes of Virtual Machines (VMs) with different guarantees in terms of availability and volatility, provisioning the same resource through multiple pricing models. For instance, in the Amazon EC2 cloud, the user pays per hour for on-demand VMs, while spot VMs are unused instances available at a lower price. Despite their monetary advantages, a spot VM can be terminated, stopped, or hibernated by EC2 at any moment. Using both hibernation-prone spot VMs (for the sake of cost) and on-demand VMs, we propose in this paper a static scheduling approach for HPC applications composed of independent tasks (bag-of-tasks) with deadline constraints. If a spot VM hibernates and does not resume in time to guarantee the application's deadline, a temporal failure takes place. Our scheduling thus aims at minimizing the monetary cost of bag-of-tasks applications in the EC2 cloud while respecting their deadlines and avoiding temporal failures. To this end, our algorithm statically creates two scheduling maps: (i) the first one contains, for each task, its starting time and the VM on which it should execute (i.e., an available spot or on-demand VM with the current lowest price); (ii) the second one contains, for each task allocated to a spot VM in the first map, its starting time and the on-demand VM on which it should be executed to meet the application deadline and thereby avoid temporal failures. The latter is used whenever the hibernation period of a spot VM exceeds a time limit. Performance results from simulations with task execution traces, the configuration of Amazon EC2 VM classes, and VM market history confirm the effectiveness of our scheduling and show that it tolerates temporal failures.
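
To make the two-map idea concrete, the following is a minimal sketch in Python, not the algorithm evaluated in the paper. It assumes a simple greedy heuristic: longest tasks are placed first on the cheapest VM that still meets the deadline, and backup slots are packed backward from the deadline on on-demand VMs. The names `VM`, `build_maps`, the prices, and the task durations are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    kind: str     # "spot" or "on-demand"
    price: float  # hypothetical price per hour


def build_maps(tasks, vms, deadline):
    """Return (primary, backup) maps: task -> (VM, start time in hours).

    tasks maps a task name to its duration in hours.
    Greedy illustration only; the paper's heuristic may differ.
    """
    free_at = {vm.name: 0.0 for vm in vms}  # time at which each VM becomes idle
    cheapest_first = sorted(vms, key=lambda v: v.price)

    # Map (i): place each task on the cheapest VM that can finish it in time.
    primary = {}
    for task, dur in sorted(tasks.items(), key=lambda kv: -kv[1]):
        for vm in cheapest_first:
            start = free_at[vm.name]
            if start + dur <= deadline:
                primary[task] = (vm, start)
                free_at[vm.name] = start + dur
                break
        else:
            raise ValueError(f"no VM can finish {task} before the deadline")

    # Map (ii): for every task placed on a spot VM, reserve the latest
    # on-demand slot that still meets the deadline (packed backward).
    on_demand = [v for v in cheapest_first if v.kind == "on-demand"]
    latest_end = {v.name: deadline for v in on_demand}
    backup = {}
    for task, (vm, _) in primary.items():
        if vm.kind != "spot":
            continue
        dur = tasks[task]
        for od in on_demand:
            start = latest_end[od.name] - dur
            # The backup slot must not overlap primary work on the same VM.
            if start >= free_at[od.name]:
                backup[task] = (od, start)
                latest_end[od.name] = start
                break
        else:
            raise ValueError(f"no on-demand backup slot for {task}")
    return primary, backup


# Hypothetical input: three tasks, one spot VM, one on-demand VM, 4-hour deadline.
tasks = {"t1": 2.0, "t2": 1.0, "t3": 1.0}
vms = [VM("spot-1", "spot", 0.03), VM("od-1", "on-demand", 0.10)]
primary, backup = build_maps(tasks, vms, deadline=4.0)
```

In this sketch, the start time stored in the backup map doubles as the hibernation time limit: if the spot VM has not resumed by then, the task has to be launched on its on-demand VM to avoid a temporal failure.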