Incast traffic in data centers can cause severe performance degradation, including packet loss and increased latency. Addressing incast effectively requires prompt and accurate detection. Existing solutions, including MA-ECN, BurstRadar, and Pulser, typically rely on fixed thresholds on switch-port egress queue lengths or their gradients to identify microbursts caused by incast flows. However, these queue-length-based methods often suffer from delayed detection and high error rates. In this study, we propose a distributed, switch-level incast detection method for data center networks that uses a probabilistic hypothesis test with an optimal detection threshold. By analyzing the arrival intervals of new flows, our algorithm can determine from a flow's initial packet whether the flow is part of incast traffic. Experimental results demonstrate that our method significantly improves on existing approaches in both detection speed and inference accuracy.
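To make the idea of a hypothesis test on new-flow arrival intervals concrete, the following is a minimal illustrative sketch, not the method evaluated in this study. It assumes, purely for illustration, that inter-arrival times of new flows are exponentially distributed with a low rate under background traffic and a high rate under incast, and that the prior probability of incast is known. Under those assumptions, the likelihood-ratio test that minimizes the error probability reduces to comparing the observed interval against a single threshold. The names `optimal_interval_threshold`, `IncastDetector`, and all parameter values are hypothetical.

```python
import math


def optimal_interval_threshold(rate_background, rate_incast, prior_incast):
    """Return the interval (seconds) below which a new flow is labeled incast.

    Assumption (not from the paper): inter-arrival times are exponential
    with rate `rate_background` under H0 and `rate_incast` under H1.
    The minimum-error likelihood-ratio test then reduces to `interval < tau`.
    """
    if not 0 < rate_background < rate_incast:
        raise ValueError("expected 0 < rate_background < rate_incast")
    if not 0 < prior_incast < 1:
        raise ValueError("prior_incast must lie strictly between 0 and 1")
    # Bayes threshold on the likelihood ratio: eta = P(H0) / P(H1).
    log_eta = math.log((1 - prior_incast) / prior_incast)
    # Solve log(l1/l0) - (l1 - l0) * x > log(eta) for the interval x.
    tau = (math.log(rate_incast / rate_background) - log_eta) / (
        rate_incast - rate_background
    )
    # A non-positive tau means no interval can ever favor incast.
    return max(tau, 0.0)


class IncastDetector:
    """Classifies each new flow from the arrival time of its first packet."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.last_arrival = None

    def on_new_flow(self, timestamp):
        """Return True if the gap since the previous new flow suggests incast."""
        previous, self.last_arrival = self.last_arrival, timestamp
        if previous is None:
            return False  # no interval is available for the first flow
        return (timestamp - previous) < self.threshold


# Hypothetical rates: 1,000 new flows/s in background, 100,000 during incast.
tau = optimal_interval_threshold(1_000.0, 100_000.0, prior_incast=0.5)
detector = IncastDetector(tau)
decisions = [detector.on_new_flow(t) for t in (0.0, 0.01, 0.01001)]
```

In this sketch the decision for a flow is available as soon as its first packet is observed, because only the gap to the previous new flow is needed. A real switch implementation would have to estimate the rates and prior online and work within data-plane constraints, which this example does not attempt.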