The increasing use of cloud computing for latency-sensitive applications has sparked renewed interest in providing tight bounds on network tail latency. Achieving this in practice at reasonable network utilization has proved elusive, due to a combination of highly bursty application demand, faster link speeds, and heavy-tailed message sizes. While priority scheduling can be used to reduce tail latency for some traffic, this comes at the cost of much worse delay behavior for all other traffic on the network. Most operators choose to run their networks at very low average utilization, despite the added cost, and yet still suffer poor tail behavior. This paper takes a different approach. We build a system, swp, to help operators (and network designers) understand and control tail latency without relying on priority scheduling. As the network workload changes, swp is designed to give real-time advice on the network switch configurations needed to maintain tail latency objectives for each traffic class. The core of swp is an efficient model for simulating the combined effect of traffic characteristics, end-to-end congestion control, and switch scheduling on service-level objectives (SLOs), along with an optimizer that adjusts the switch-level scheduling weights assigned to each class. Using simulation across a diverse set of workloads with different SLOs, we show that FIFO would require 65% greater link capacity to meet the same SLOs that swp provides, and 79% more in scenarios with tight SLOs on bursty traffic classes.