Mean field games (MFGs) provide a mathematically tractable framework for modelling large-scale multi-agent systems by leveraging mean field theory to simplify interactions among agents. This makes it possible to apply inverse reinforcement learning (IRL) to predict the behaviour of large populations by recovering reward signals from demonstrated behaviours. However, existing IRL methods for MFGs cannot reason about uncertainties in the demonstrated behaviours of individual agents. This paper proposes a novel framework, Mean-Field Adversarial IRL (MF-AIRL), which is capable of tackling uncertainties in demonstrations. We build MF-AIRL upon maximum entropy IRL and a new equilibrium concept. We evaluate our approach on simulated tasks with imperfect demonstrations. Experimental results demonstrate the superiority of MF-AIRL over existing methods in reward recovery.