Recent years have seen a substantial increase in the capacity and parallel processing power of data centers and cloud services. To fully utilize these distributed systems, optimal load balancing for parallel queuing architectures must be realized. Existing state-of-the-art solutions fail to consider the effect of communication delays on the behaviour of very large systems with many clients. In this work, we consider a multi-agent load balancing system with delayed information, consisting of many clients (load balancers) and many parallel queues. To obtain a tractable solution, we model this system as a discrete-time mean-field control problem with an enlarged state-action space, obtained through exact discretization. Subsequently, we apply policy gradient reinforcement learning algorithms to find an optimal load balancing solution. Here, the discrete-time system model incorporates a synchronization delay, under which the queue state information is synchronously broadcast to and updated at all clients. We then provide theoretical performance guarantees for our methodology in large systems. Finally, using experiments, we demonstrate that our approach is not only scalable but also performs well compared to the state-of-the-art power-of-d variant of Join-the-Shortest-Queue (JSQ) and other policies in the presence of synchronization delays.
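To make the setup concrete, the following is a minimal sketch, not the paper's exact algorithm, of policy gradient learning (REINFORCE) for routing jobs to parallel queues when the queue-length information available to the router is only refreshed by a synchronous broadcast every few steps. All environment parameters (`N_QUEUES`, `SYNC_DELAY`, arrival and service probabilities) and the softmax policy parameterization are illustrative assumptions.

```python
# Illustrative sketch: REINFORCE for load balancing under a synchronization
# delay. Not the paper's method; all names and parameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_QUEUES = 5       # number of parallel queues (illustrative)
BUFFER = 10        # per-queue capacity (illustrative)
SYNC_DELAY = 3     # steps between synchronous broadcasts of queue states
ARRIVAL_P = 0.7    # per-step job arrival probability
SERVICE_P = 0.5    # per-step per-queue service-completion probability
EPISODE_LEN = 200
LR = 0.05

# Softmax policy over queues: each queue is scored by a learned weight
# indexed by its last *observed* (possibly stale) queue length.
theta = np.zeros(BUFFER + 1)

def policy_probs(observed):
    scores = theta[observed]
    scores = scores - scores.max()  # numerical stability
    p = np.exp(scores)
    return p / p.sum()

def run_episode():
    queues = np.zeros(N_QUEUES, dtype=int)
    observed = queues.copy()  # last broadcast state, stale between syncs
    grads, rewards = [], []
    for t in range(EPISODE_LEN):
        if t % SYNC_DELAY == 0:          # synchronous broadcast to all clients
            observed = queues.copy()
        p = policy_probs(observed)
        if rng.random() < ARRIVAL_P:     # a job arrives and must be routed
            a = rng.choice(N_QUEUES, p=p)
            if queues[a] < BUFFER:
                queues[a] += 1
            # Gradient of log softmax w.r.t. theta, bucketed by observed length.
            g = np.zeros_like(theta)
            g[observed[a]] += 1.0
            for i in range(N_QUEUES):
                g[observed[i]] -= p[i]
            grads.append(g)
            rewards.append(-queues.sum())  # cost: total backlog
        # Service completions happen on the true (not observed) state.
        done = rng.random(N_QUEUES) < SERVICE_P
        queues = np.maximum(queues - done.astype(int), 0)
    return grads, rewards

for episode in range(300):
    grads, rewards = run_episode()
    if not grads:
        continue
    returns = np.cumsum(rewards[::-1])[::-1]  # reward-to-go
    baseline = returns.mean()                 # simple variance-reduction baseline
    for g, G in zip(grads, returns):
        theta += LR * (G - baseline) * g / len(grads)
```

Because the policy conditions only on the stale broadcast state, it learns to hedge against the information gap, e.g. spreading load more evenly than greedy shortest-queue routing would on outdated observations.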