This paper investigates the network load balancing problem in data centers (DCs) where multiple load balancers (LBs) are deployed, using the multi-agent reinforcement learning (MARL) framework. The problem is challenging because of heterogeneous processing architectures and dynamic environments, and because each LB agent in a distributed networking system has only limited, partial observability; these factors can substantially degrade the performance of in-production load balancing algorithms in real-world setups. The centralised-training-decentralised-execution (CTDE) RL scheme has been proposed to improve MARL performance, yet it incurs additional communication and management overhead among agents, particularly in distributed networking systems, which prefer a distributed, plug-and-play design. We formulate the multi-agent load balancing problem as a Markov potential game, with a carefully designed workload distribution fairness measure as the potential function. A fully distributed MARL algorithm is proposed to approximate the Nash equilibrium of the game. Experimental evaluations involve both an event-driven simulator and a real-world system: the proposed MARL load balancing algorithm shows close-to-optimal performance in simulations and outperforms in-production LBs in the real-world system.
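To make the idea of using workload distribution fairness as a potential function more concrete, the following is a minimal sketch under stated assumptions, not the paper's algorithm. The abstract does not say which fairness measure is used; Jain's fairness index is swapped in here purely for illustration. The game is also reduced to a stateless, one-shot toy in which each LB sends all of its jobs to a single server and every LB's reward equals the shared fairness value. All names (`jain_fairness`, `server_loads`, `best_response_dynamics`) are hypothetical.

```python
def jain_fairness(loads):
    """Jain's fairness index: 1.0 for perfectly even loads, 1/n for one hot spot.

    Assumption: the source does not name its fairness measure; this is a stand-in.
    """
    total = sum(loads)
    squares = sum(x * x for x in loads)
    if squares == 0:
        return 1.0  # no load anywhere is treated as perfectly fair
    return total * total / (len(loads) * squares)


def server_loads(actions, jobs_per_lb, n_servers):
    """Toy model: actions[i] is the single server that LB i sends all its jobs to."""
    loads = [0] * n_servers
    for lb, server in enumerate(actions):
        loads[server] += jobs_per_lb[lb]
    return loads


def best_response_dynamics(jobs_per_lb, n_servers, start):
    """Let each LB in turn switch servers whenever that strictly raises the potential.

    Because every unilateral improvement strictly increases the shared potential
    and the action space is finite, the loop terminates at a pure Nash equilibrium
    of this toy game.
    """
    actions = list(start)
    improved = True
    while improved:
        improved = False
        for lb in range(len(actions)):
            current = jain_fairness(server_loads(actions, jobs_per_lb, n_servers))
            for server in range(n_servers):
                trial = actions[:lb] + [server] + actions[lb + 1:]
                value = jain_fairness(server_loads(trial, jobs_per_lb, n_servers))
                if value > current + 1e-12:
                    actions, current, improved = trial, value, True
    return actions


if __name__ == "__main__":
    jobs = [4, 4]  # two LBs with equal job counts (illustrative numbers)
    final = best_response_dynamics(jobs, n_servers=2, start=[0, 0])
    print(final, jain_fairness(server_loads(final, jobs, 2)))
```

In this toy setting each agent only needs the value of the shared potential to decide its move, which hints at why a potential-game formulation suits fully distributed learning; the Markov (stateful) setting and the learning algorithm described in the abstract are beyond what this sketch covers.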