Asymmetric multiplayer (AMP) games are a popular genre in which multiple types of agents compete or collaborate with each other. Training agents that can defeat top human players in AMP games is difficult with typical self-play training methods because of the unbalanced characteristics of their asymmetric environments. We propose asymmetric-evolution training (AET), a novel multi-agent reinforcement learning framework that can train multiple kinds of agents simultaneously in an AMP game. We designed adaptive data adjustment (ADA) and environment randomization (ER) to optimize the AET process. We tested our method on a complex AMP game named Tom \& Jerry, and our AIs, trained without any human data, achieved a win rate of 98.5\% against top human players over 65 matches. Ablation experiments indicated that the proposed modules are beneficial to the framework.