Although the field of multi-agent reinforcement learning (MARL) has made considerable progress in recent years, solving systems with a large number of agents remains a hard challenge. Graphon mean field games (GMFGs) enable the scalable analysis of MARL problems that are otherwise intractable. Due to the mathematical structure of graphons, however, this approach is limited to dense graphs, which are insufficient to describe many real-world networks such as power law graphs. Our paper introduces a novel formulation of GMFGs, called LPGMFGs, which leverages the graph-theoretical concept of $L^p$ graphons and provides a machine learning tool to efficiently and accurately approximate solutions for sparse network problems. This especially includes power law networks, which are empirically observed in various application areas and cannot be captured by standard graphons. We derive theoretical existence and convergence guarantees and give empirical examples that demonstrate the accuracy of our learning approach for systems with many agents. Furthermore, we rigorously extend the Online Mirror Descent (OMD) learning algorithm to our setup to accelerate learning speed, allow for agent interaction through the mean field in the transition kernel, and empirically show its capabilities. Overall, we provide a scalable, mathematically well-founded machine learning approach to a large class of otherwise intractable problems of great relevance in numerous research fields.