In this paper we present a scalable deep learning framework for finding Markovian Nash equilibria in multi-agent stochastic games using fictitious play. Our approach is motivated by the theoretical analysis of Forward-Backward Stochastic Differential Equations (FBSDEs) and their implementation in a deep learning setting, which is the source of our algorithm's improved sample efficiency. By exploiting the permutation invariance of agents in symmetric games, we further improve both scalability and performance significantly. On an inter-bank lending/borrowing problem, our framework outperforms the state-of-the-art deep fictitious play algorithm across multiple metrics. More importantly, our approach scales up to 3000 agents in simulation, which, to the best of our knowledge, is a new state of the art. We also demonstrate the applicability of our framework to robotics on a belief-space autonomous racing problem.