As the number of distributed services (or microservices) of cloud-native applications grows, resource management becomes a challenging task. These applications tend to be user-facing and latency-sensitive, and our goal is to continuously minimize the amount of CPU resources allocated while still satisfying the application latency SLO. Although previous efforts have proposed simple heuristics and sophisticated ML-based techniques, we believe that a practical resource manager should accurately scale CPU resources for diverse applications, with minimal human effort and operational overhead. To this end, we ask: can we systematically break resource management down into subproblems solvable by practical policies? Based on the notion of a CPU-throttle-based performance target, we decouple the mechanisms of SLO feedback and resource control, and implement a two-level framework -- Autothrottle. It combines a lightweight learned controller at the global level with agile per-microservice controllers at the local level. We evaluate Autothrottle on three microservice applications, with both short-term and 21-day production workload traces. Empirical results show Autothrottle's superior CPU core savings of up to 26.21% over the best-performing baselines across applications, while maintaining the latency SLO.
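The two-level decoupling described above can be illustrated with a minimal control-loop sketch. This is a hypothetical simplification, not Autothrottle's actual algorithm: the function names, proportional control laws, and step sizes are all illustrative assumptions, meant only to show how a throttle-based performance target lets the global SLO-feedback loop and the local resource-control loops operate independently.

```python
# Hypothetical sketch of a two-level, throttle-target-based resource manager.
# The control laws and step sizes below are illustrative, not the paper's.

def local_controller(cpu_limit, throttle_ratio, target_throttle, step=0.1):
    """Per-microservice loop: nudge the CPU limit so the observed
    CPU-throttle ratio tracks the target set by the global controller."""
    if throttle_ratio > target_throttle:
        return cpu_limit * (1 + step)        # throttled too often: add CPU
    return max(cpu_limit * (1 - step), 0.1)  # slack: reclaim CPU

def global_controller(p99_latency, latency_slo, target_throttle, step=0.01):
    """Application-wide loop: tighten the throttle target when the latency
    SLO is at risk, relax it when there is latency headroom."""
    if p99_latency > latency_slo:
        return max(target_throttle - step, 0.0)  # demand more CPU headroom
    return min(target_throttle + step, 1.0)      # tolerate more throttling
```

The key design point this sketch tries to convey is that the local loops never see latency: they only chase a throttle target, which the (in the paper, learned) global loop periodically adjusts from end-to-end SLO feedback.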