The theory of mean-field games (MFGs) has grown rapidly in recent years. However, existing analytical approaches are largely restricted to contractive or monotone settings, or rely on an a priori assumption that the Nash equilibrium (NE) solution is unique, for the sake of computational feasibility. This paper proposes a new mathematical framework for analyzing discrete-time MFGs without any of these restrictions. The key idea is to reformulate the problem of finding NE solutions of an MFG as an equivalent optimization problem with bounded variables and simple convex constraints. The reformulation builds on the classical representation of a Markov decision process (MDP) as a linear program, adds the consistency constraint for MFGs in terms of occupation measures, and exploits the complementarity structure of the linear program. Under proper regularity conditions on the rewards and the dynamics of the game, the resulting framework, called MF-OMO (Mean-Field Occupation Measure Optimization), is shown to provide convergence guarantees for finding multiple (and possibly all) NE solutions of MFGs with popular algorithms such as projected gradient descent. In particular, we show that analyzing the class of MFGs with linear rewards and mean-field independent dynamics reduces to solving a finite number of linear programs, and can therefore be done in finite time. This optimization framework extends readily to variants of MFGs, including but not limited to personalized MFGs and multi-population MFGs.
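The linear-programming view of an MDP mentioned above can be illustrated with a small numerical sketch. It is not the MF-OMO algorithm itself and is not taken from the paper: it covers only the occupation-measure ingredient, for a finite-horizon MDP with a fixed (or absent) mean-field term. All names (`occupation_measure`, `flow_residual`, `policy_value`) and the random problem data are hypothetical. The sketch checks two facts the reformulation relies on: the occupation measure of a policy satisfies linear flow constraints, and the expected return is a linear function of that measure.

```python
import numpy as np

def occupation_measure(mu0, P, pi):
    """Return d[t, s, a] = Prob(s_t = s, a_t = a) under policy pi.

    mu0: (S,) initial state distribution
    P:   (T, S, A, S) transition kernels, P[t, s, a, s2] = Prob(s2 | s, a)
    pi:  (T, S, A) policy, pi[t, s, a] = Prob(a | s) at time t
    """
    T, S, A = pi.shape
    d = np.zeros((T, S, A))
    state_dist = mu0
    for t in range(T):
        d[t] = state_dist[:, None] * pi[t]
        # Push the joint (state, action) distribution through the dynamics.
        state_dist = np.einsum("sa,sap->p", d[t], P[t])
    return d

def flow_residual(d, mu0, P):
    """Largest violation of the linear flow constraints on d."""
    T = d.shape[0]
    # Initial condition: the state marginal at t = 0 equals mu0.
    res = np.abs(d[0].sum(axis=1) - mu0).max()
    for t in range(T - 1):
        # Mass entering each state at t + 1 must equal mass leaving time t.
        inflow = np.einsum("sa,sap->p", d[t], P[t])
        res = max(res, np.abs(d[t + 1].sum(axis=1) - inflow).max())
    return float(res)

def policy_value(mu0, P, r, pi):
    """Expected total reward of pi, computed by backward induction."""
    T, S, A = pi.shape
    V = np.zeros(S)
    for t in reversed(range(T)):
        Q = r[t] + P[t] @ V          # (S, A, S) @ (S,) -> (S, A)
        V = (pi[t] * Q).sum(axis=1)
    return float(mu0 @ V)

# Hypothetical random instance, used only for illustration.
rng = np.random.default_rng(0)
T, S, A = 4, 3, 2
mu0 = rng.dirichlet(np.ones(S))
P = rng.dirichlet(np.ones(S), size=(T, S, A))   # shape (T, S, A, S)
pi = rng.dirichlet(np.ones(A), size=(T, S))     # shape (T, S, A)
r = rng.uniform(size=(T, S, A))

d = occupation_measure(mu0, P, pi)
linear_objective = float((r * d).sum())          # <r, d>, linear in d
backward_value = policy_value(mu0, P, r, pi)
```

In an MFG the rewards and dynamics would additionally depend on the population distribution, and the consistency constraint would require that distribution to coincide with the occupation measure `d`. This sketch omits that coupling.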