Mean field theory provides an effective way of scaling multiagent reinforcement learning algorithms to environments with many agents that can be abstracted by a virtual mean agent. In this paper, we extend mean field multiagent algorithms to multiple types. The types enable the relaxation of a core assumption in mean field reinforcement learning, namely that all agents in the environment play nearly identical strategies and share the same goal. We conduct experiments on three different testbeds for many-agent reinforcement learning, based on the standard MAgent framework. We consider two different kinds of mean field environments: a) games where agents belong to predefined types that are known a priori, and b) games where the type of each agent is unknown and therefore must be learned from observations. We introduce new algorithms for each kind of game and demonstrate their superior performance over state-of-the-art algorithms that assume all agents belong to the same type, as well as over other baseline algorithms in the MAgent framework.
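To make the multi-type relaxation concrete, the following sketch illustrates how an agent could summarize its neighborhood with one mean action per type rather than a single global mean action. It is an illustrative assumption of the general idea, not the paper's implementation; the function names (`one_hot`, `mean_actions_by_type`) and the uniform fallback for empty types are our own choices.

```python
import numpy as np

def one_hot(action, num_actions):
    """Encode a discrete action index as a one-hot vector."""
    v = np.zeros(num_actions)
    v[action] = 1.0
    return v

def mean_actions_by_type(neighbor_actions, neighbor_types, num_types, num_actions):
    """Compute one mean action vector per type from a neighborhood.

    neighbor_actions: discrete actions taken by neighboring agents
    neighbor_types:   type labels (0..num_types-1), same length
    Returns an array of shape (num_types, num_actions); rows for types with
    no neighbors fall back to a uniform distribution (an assumption here).
    """
    means = np.full((num_types, num_actions), 1.0 / num_actions)
    for t in range(num_types):
        acts = [one_hot(a, num_actions)
                for a, k in zip(neighbor_actions, neighbor_types) if k == t]
        if acts:
            means[t] = np.mean(acts, axis=0)
    return means

# Example: 5 neighbors of 2 types in a 3-action game.
print(mean_actions_by_type([0, 2, 1, 1, 0], [0, 0, 1, 1, 1],
                           num_types=2, num_actions=3))
```

In the single-type setting, the Q-function is conditioned on one mean action; under the multi-type view sketched above, it would instead be conditioned on the stacked per-type means, so agents of different types no longer need to play similar strategies or share a goal.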
翻译:平均场理论为将多试剂强化学习算法推广到环境提供了一种有效的方法,使环境中的多个代理商可以被虚拟平均代理商所抽取。 在本文中,我们将平均值的场外多试算法推广到多种类型。 类型可以使中度场外强化学习的核心假设放松, 即环境中的所有代理商都在玩几乎相似的战略, 目标相同。 我们根据标准MAGents框架, 对多个代理商强化学习领域的三个不同的测试台进行实验。 我们考虑了两种不同的不同的平均场环境: a) 代理商属于事先定义的已知类型; b) 每种代理商类型未知的游戏, 因此必须根据观察来学习。 我们为每种类型的游戏引入新的算法, 并展示其优于假定所有代理商都属于同一类型以及MAGent框架中其他基线算法的艺术算法的状态。