This paper introduces Honor of Kings Arena, a reinforcement learning (RL) environment based on Honor of Kings, one of the world's most popular games at present. Compared to the environments studied in most previous work, ours presents new generalization challenges for competitive reinforcement learning. It is a multi-agent problem in which one agent competes against its opponent, and it requires generalization ability because there are diverse targets to control and diverse opponents to compete against. We describe the observation, action, and reward specifications for the Honor of Kings domain and provide an open-source Python-based interface for communicating with the game engine. We provide twenty target heroes with a variety of tasks in Honor of Kings Arena and present initial baseline results for RL-based methods with feasible computing resources. Finally, we showcase the generalization challenges imposed by Honor of Kings Arena and possible remedies for them. All of the software, including the environment class, is publicly available at https://github.com/tencent-ailab/hok_env. The documentation is available at https://aiarena.tencent.com/hok/doc/.