A growing number of service providers are exploring methods to improve server utilization, reduce power consumption, and lower total cost of ownership by co-scheduling high-priority, latency-critical workloads with best-effort workloads. This practice requires strict resource allocation between workloads to limit resource contention and maintain Quality of Service (QoS) guarantees. Prior work on resource allocation has been shown to improve server utilization under ideal circumstances, yet it often compromises QoS guarantees or fails to find valid resource allocations in more dynamic operating environments. Further, prior work relies fundamentally on QoS measurements that can, in practice, exhibit significant transient fluctuations, so stable control behavior cannot be reliably achieved. In this paper, we propose a novel framework for dynamic resource allocation based on proactive QoS prediction. These predictions guide a reinforcement-learning-based resource controller towards optimal resource allocations while avoiding transient QoS violations caused by fluctuating workload demands. Evaluation shows that the proposed method incurs 4.3x fewer QoS violations, reduces the severity of QoS violations by 3.7x, improves best-effort workload performance, and improves overall power efficiency compared with prior work.