Non-cooperative and cooperative games with a very large number of players have many applications but remain generally intractable as the number of players increases. Introduced by Lasry and Lions, and by Huang, Caines and Malhamé, Mean Field Games (MFGs) rely on a mean-field approximation to allow the number of players to grow to infinity. Traditional methods for solving these games generally rely on solving partial or stochastic differential equations with full knowledge of the model. Recently, Reinforcement Learning (RL) has shown promise for solving complex problems. By combining MFGs and RL, we hope to solve games at a very large scale both in terms of population size and environment complexity. In this survey, we review the quickly growing recent literature on RL methods for learning Nash equilibria in MFGs. We first identify the most common settings (static, stationary, and evolutive). We then present a general framework for classical iterative methods (based on best-response computation or policy evaluation) that solve MFGs exactly. Building on these algorithms and the connection with Markov Decision Processes, we explain how RL can be used to learn MFG solutions in a model-free way. Lastly, we present numerical illustrations on a benchmark problem and conclude with some perspectives.