When parallelizing a set of jobs across many servers, one must balance a trade-off between granting priority to short jobs and maintaining the overall efficiency of the system. When the goal is to minimize the mean flow time of a set of jobs, it is usually the case that one wants to complete short jobs before long jobs. However, since jobs usually cannot be parallelized with perfect efficiency, granting strict priority to the short jobs can result in very low system efficiency which in turn hurts the mean flow time across jobs. In this paper, we derive the optimal policy for allocating servers to jobs at every moment in time in order to minimize mean flow time across jobs. We assume that jobs follow a sublinear, concave speedup function, and hence jobs experience diminishing returns from being allocated additional servers. We show that the optimal policy, heSRPT, will complete jobs according to their size order, but maintains overall system efficiency by allocating some servers to each job at every moment in time. We compare heSRPT with state-of-the-art allocation policies from the literature and show that heSRPT outperforms its competitors by at least 30%, and often by much more.
翻译:在将许多服务器的一组工作平行化时,必须平衡兼顾对短期工作给予优先考虑和保持系统总体效率之间的权衡。当目标是最大限度地减少一组工作的平均流动时间时,通常就是想要在长期工作之前完成短期工作。然而,由于工作通常不能以完美效率平行化,对短期工作给予严格的优先考虑可能导致系统效率非常低,这反过来又会损害各工作的平均流动时间。在本文中,我们制定了将服务器分配给每个工作的最佳政策,以便尽可能减少各工作之间的平均流动时间。我们假定,工作遵循一个子线性、配置加速功能,从而减少从分配额外服务器中获得的回报的工作经验。我们表明,最佳政策,即HERPT, 将按其规模顺序完成工作,但保持整个系统效率,即每时每刻分配一些服务器。我们将HERPT与文献中最先进的分配政策进行比较,并表明,HERPT比其竞争对手至少超过30%,而且往往超过30%。