Effective congestion control for data center networks is becoming increasingly challenging with a growing amount of latency-sensitive traffic, much fatter links, and extremely bursty traffic. Widely deployed algorithms, such as DCTCP and DCQCN, are still far from optimal in many plausible scenarios, particularly for tail latency. Many operators compensate by running their networks at low average utilization, dramatically increasing costs. In this paper, we argue that we have reached the practical limits of end-to-end congestion control. Instead, we propose, implement, and evaluate a new congestion control architecture called Backpressure Flow Control (BFC). BFC provides per-hop, per-flow flow control, but with bounded state, constant-time switch operations, and careful use of buffers. We demonstrate BFC's feasibility by implementing it on a state-of-the-art P4-based programmable hardware switch. In simulation, we show that BFC achieves near-optimal throughput and tail latency even under challenging conditions such as high network load and incast cross traffic. Compared to existing end-to-end schemes, BFC achieves 2.3-60× lower tail latency for short flows and 1.6-5× better average completion time for long flows.
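To make the idea of per-hop, per-flow flow control with bounded state more concrete, the following is a minimal toy sketch in Python. It is not the BFC design itself: the class `ToySwitchPort`, the modulo-based flow-to-queue mapping, and the single `pause_threshold` are all hypothetical choices made for illustration. It only shows the general pattern of a switch port that keeps a fixed number of queues, assigns each flow to a queue in constant time, and signals the upstream hop to pause or resume a queue based on its occupancy.

```python
from collections import deque


class ToySwitchPort:
    """Toy egress port: a fixed number of FIFO queues with pause/resume signals.

    Illustrative only. The thresholds and the modulo-based queue assignment
    are assumptions, not details taken from the BFC paper.
    """

    def __init__(self, num_queues=4, pause_threshold=3):
        # Bounded state: the number of queues is fixed, regardless of flow count.
        self.queues = [deque() for _ in range(num_queues)]
        self.pause_threshold = pause_threshold
        self.paused = set()  # queue indices for which upstream is paused

    def queue_index(self, flow_id):
        # Constant-time mapping from an integer flow ID to a queue
        # (hypothetical choice; flows may collide on the same queue).
        return flow_id % len(self.queues)

    def enqueue(self, flow_id, packet):
        """Buffer a packet; return the queue index to pause upstream, or None."""
        idx = self.queue_index(flow_id)
        self.queues[idx].append(packet)
        if len(self.queues[idx]) >= self.pause_threshold and idx not in self.paused:
            self.paused.add(idx)
            return idx
        return None

    def dequeue(self, idx):
        """Send one packet from queue idx; return (packet, resume_flag)."""
        packet = self.queues[idx].popleft()
        resume = idx in self.paused and len(self.queues[idx]) < self.pause_threshold
        if resume:
            self.paused.discard(idx)
        return packet, resume
```

In this sketch, backpressure acts hop by hop: a full queue pauses only the upstream sender feeding that queue, and the pause is lifted as soon as the queue drains below the threshold. A real switch would also have to handle in-flight packets that arrive after a pause is sent, which this toy model ignores.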