Consolidating latency-critical (LC) and best-effort (BE) tenants at the storage backend helps increase resource utilization. Even if tenants use dedicated queues and threads to achieve performance isolation, their threads still contend for CPU cores. We therefore argue that cores must be partitioned between LC and BE tenants, with each core dedicated to running a single thread. Besides frequently changing bursty load, fluctuating service time at the storage backend also drastically changes the number of cores needed. To guarantee tail latency service level objectives (SLOs), an abrupt change in core demand must be satisfied immediately; otherwise, the tail latency SLO is violated. Unfortunately, partitioning-based approaches cannot react to changing core demand, resulting in extreme latency spikes and SLO violations. In this paper, we present QWin, a tail latency SLO-aware core allocation approach that enforces tail latency SLOs at a shared storage backend. QWin consists of an SLO-to-core calculation model, which accurately calculates the number of cores needed in combination with the definitive runtime load determined by a flexible request-based window, and an autonomous core allocation mechanism, which adjusts cores at an adaptive frequency by dynamically changing core policies. When consolidating multiple LC and BE tenants, QWin outperforms state-of-the-art approaches in guaranteeing tail latency SLOs for LC tenants while increasing the bandwidth of BE tenants by up to 31x.
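To make the idea of an SLO-to-core calculation over a request-based window more concrete, the following is a minimal illustrative sketch. It is not QWin's actual model, which the abstract does not spell out. It assumes a simple estimate in which the cores needed equal the total work queued in a window divided by the latency budget, rounded up. The names `cores_needed` and `partition_cores`, as well as all parameters and units, are hypothetical.

```python
import math


def cores_needed(queued_requests: int,
                 avg_service_time_us: float,
                 slo_budget_us: float) -> int:
    """Hypothetical estimate, not QWin's model.

    Returns the smallest number of cores that could drain the requests
    in one window within the latency budget, assuming each core serves
    one request at a time at the given average service time.
    """
    if slo_budget_us <= 0:
        raise ValueError("slo_budget_us must be positive")
    if queued_requests <= 0:
        return 0
    total_work_us = queued_requests * avg_service_time_us
    return max(1, math.ceil(total_work_us / slo_budget_us))


def partition_cores(total_cores: int, lc_demand: int) -> tuple[int, int]:
    """Assumed policy: give LC tenants what they need (capped at the
    total) and leave the remaining cores to BE tenants."""
    lc_cores = min(total_cores, max(0, lc_demand))
    return lc_cores, total_cores - lc_cores


if __name__ == "__main__":
    # A window of 100 requests at 50 us each with a 1000 us budget.
    demand = cores_needed(100, 50.0, 1000.0)
    print(demand, partition_cores(16, demand))
```

Under this assumed estimate, a rise in service time or in the number of requests per window raises the LC core demand right away, which is the kind of abrupt change that a static partition cannot follow.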