Recent advances at the intersection of dense large graph limits and mean field games have begun to enable the scalable analysis of a broad class of dynamical sequential games with large numbers of agents. So far, results have been largely limited to graphon mean field systems with continuous-time diffusive or jump dynamics, typically without control and with little focus on computational methods. We propose a novel discrete-time formulation of graphon mean field games as the limit of non-linear dense graph Markov games with weak interaction. On the theoretical side, we establish rigorous existence and approximation properties of the graphon mean field solution in sufficiently large systems. On the practical side, we provide general learning schemes for graphon mean field equilibria, either by introducing agent equivalence classes or by reformulating the graphon mean field system as a classical mean field system. By repeatedly computing a regularized optimal control solution and the mean field it generates, we obtain plausible approximate Nash equilibria in otherwise intractable large dense graph games with many agents. Empirically, we demonstrate on a number of examples that, for our computed equilibria, the finite-agent behavior comes increasingly close to the mean field behavior as the graph or system size grows, verifying our theory. More generally, we successfully apply policy gradient reinforcement learning in conjunction with sequential Monte Carlo methods.
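As a rough illustration of the learning scheme sketched above (a regularized best response followed by mean field propagation, iterated toward a fixed point), the following Python sketch discretizes the graphon into K agent equivalence classes and alternates entropy-regularized backward induction with a forward mean field pass. All model ingredients here (the graphon W, the transition kernel, the reward, and every parameter value) are illustrative assumptions for a toy two-state problem, not the paper's benchmark models or exact algorithm.

```python
import numpy as np

# Illustrative sizes and hyperparameters (assumptions, not from the paper).
S, A, T, K = 2, 2, 10, 10          # states, actions, horizon, graphon blocks
eta, damp, iters = 0.5, 0.5, 50    # entropy regularization, damping, iterations
alphas = (np.arange(K) + 0.5) / K  # representative point of each block

def W(x, y):
    """Hypothetical graphon: W(x, y) = 1 - max(x, y)."""
    return 1.0 - np.maximum(x, y)

def transition(nu):
    """Hypothetical kernel P[a, s, s'] coupled to the graphon-weighted
    neighborhood state distribution nu (shape (S,))."""
    p = 0.2 + 0.6 * nu[1]          # pressure from neighbors in state 1
    return np.array([[[1 - p, p], [0.3, 0.7]],
                     [[1 - 0.5 * p, 0.5 * p], [0.5, 0.5]]])

def reward(nu):
    """Hypothetical reward r[s, a]: state 1 is costly, action 1 costs extra."""
    R = np.zeros((S, A))
    R[1, :] -= 1.0
    R[:, 1] -= 0.3
    return R

def neighborhood(mu, k, t):
    """Graphon-weighted aggregate (1/K) * sum_l W(a_k, a_l) * mu_l(t)."""
    w = W(alphas[k], alphas)                         # (K,)
    return (w[:, None] * mu[:, t, :]).sum(0) / K

def best_response(mu, k):
    """Entropy-regularized backward induction for block k -> softmax policy."""
    pi, V = np.zeros((T, S, A)), np.zeros(S)
    for t in reversed(range(T)):
        nu = neighborhood(mu, k, t)
        P, R = transition(nu), reward(nu)
        Q = R + np.stack([P[a] @ V for a in range(A)], axis=1)   # (S, A)
        m = Q.max(1, keepdims=True)
        pi[t] = np.exp((Q - m) / eta)
        pi[t] /= pi[t].sum(1, keepdims=True)
        V = m[:, 0] + eta * np.log(np.exp((Q - m) / eta).sum(1))  # soft value
    return pi

def induced_mean_field(pis):
    """Forward pass: propagate all blocks jointly under their policies."""
    mu = np.zeros((K, T + 1, S))
    mu[:, 0, 0] = 1.0                                # all agents start in state 0
    for t in range(T):
        for k in range(K):
            P = transition(neighborhood(mu, k, t))
            step = sum(pis[k][t][:, a][:, None] * P[a] for a in range(A))
            mu[k, t + 1] = mu[k, t] @ step
    return mu

# Damped fixed-point iteration over the ensemble of block mean fields.
mu = np.full((K, T + 1, S), 1.0 / S)
for _ in range(iters):
    pis = [best_response(mu, k) for k in range(K)]
    mu = damp * mu + (1 - damp) * induced_mean_field(pis)
print(mu[0, -1])                                     # terminal distribution, block 0
```

The damping of the mean field update is one common heuristic for stabilizing such fixed-point iterations; the entropy regularization plays the role of the regularized optimal control step mentioned in the abstract, and in larger problems the exact backward induction would be replaced by (policy gradient) reinforcement learning.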