This paper presents the network load balancing problem, a challenging real-world task for multi-agent reinforcement learning (MARL) methods. Traditional heuristic solutions such as Weighted-Cost Multi-Path (WCMP) and Local Shortest Queue (LSQ) adapt poorly to changing workload distributions and arrival rates, and they balance load poorly across multiple load balancers. The cooperative network load balancing task is formulated as a decentralized partially observable Markov decision process (Dec-POMDP), which lends itself naturally to MARL methods. To bridge the reality gap in applying learning-based methods, all methods are trained and evaluated directly on an emulation system at moderate to large scale. Experiments on realistic testbeds show that independent, "selfish" load balancing strategies are not necessarily globally optimal, while the proposed MARL solution performs better across a range of realistic settings. Additionally, the potential difficulties of applying MARL methods to network load balancing are analysed, with the aim of drawing the attention of the learning and networking communities to these challenges.