In real-time strategy (RTS) game artificial intelligence research, various multi-agent deep reinforcement learning (MADRL) algorithms are now widely and actively used. Most of this research is based on the StarCraft II environment because it is one of the most well-known RTS games worldwide. Our proposed MADRL-based algorithm builds on a distributed MADRL method called QMIX. On top of QMIX-based distributed computation, we employ state categorization, which can significantly reduce computational complexity. Furthermore, self-attention mechanisms are used to identify the relationships among agents in the form of a graph. Based on these approaches, we propose a categorized state graph attention policy (CSGA-policy). As observed in the performance evaluation of the proposed CSGA-policy in the well-known StarCraft II simulation environment, our algorithm works well in various settings, as expected.
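To make the two named building blocks concrete, the following is a minimal sketch (not the authors' implementation) of (1) a QMIX-style mixing network, whose hypernetwork-generated weights are kept non-negative so that the joint value is monotonic in each agent's Q-value, and (2) a single-head self-attention layer over per-agent features, whose attention matrix can be read as a soft adjacency graph among agents. PyTorch and all dimensions and names (`N_AGENTS`, `STATE_DIM`, `EMBED_DIM`, `FEAT_DIM`, `QMixMixer`, `AgentGraphAttention`) are illustrative assumptions, not from the paper.

```python
# Illustrative sketch only: QMIX-style mixing + self-attention over agents.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_AGENTS, STATE_DIM, EMBED_DIM, FEAT_DIM = 4, 32, 64, 16  # assumed sizes


class QMixMixer(nn.Module):
    """Mixes per-agent Q-values into Q_tot, monotonic in each agent's Q."""

    def __init__(self):
        super().__init__()
        # Hypernetworks generate the mixing weights from the global state.
        self.hyper_w1 = nn.Linear(STATE_DIM, N_AGENTS * EMBED_DIM)
        self.hyper_b1 = nn.Linear(STATE_DIM, EMBED_DIM)
        self.hyper_w2 = nn.Linear(STATE_DIM, EMBED_DIM)
        self.hyper_b2 = nn.Sequential(nn.Linear(STATE_DIM, EMBED_DIM),
                                      nn.ReLU(),
                                      nn.Linear(EMBED_DIM, 1))

    def forward(self, agent_qs, state):
        # agent_qs: (batch, N_AGENTS); state: (batch, STATE_DIM)
        b = agent_qs.size(0)
        # abs() keeps mixing weights non-negative, so Q_tot is monotonic in
        # each agent's Q and the per-agent argmax decentralizes (QMIX's idea).
        w1 = torch.abs(self.hyper_w1(state)).view(b, N_AGENTS, EMBED_DIM)
        b1 = self.hyper_b1(state).view(b, 1, EMBED_DIM)
        hidden = F.elu(torch.bmm(agent_qs.view(b, 1, N_AGENTS), w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(b, EMBED_DIM, 1)
        b2 = self.hyper_b2(state).view(b, 1, 1)
        return (torch.bmm(hidden, w2) + b2).view(b, 1)  # Q_tot


class AgentGraphAttention(nn.Module):
    """Single-head self-attention over agents; attn acts as a soft graph."""

    def __init__(self):
        super().__init__()
        self.q = nn.Linear(FEAT_DIM, FEAT_DIM)
        self.k = nn.Linear(FEAT_DIM, FEAT_DIM)
        self.v = nn.Linear(FEAT_DIM, FEAT_DIM)

    def forward(self, feats):
        # feats: (batch, N_AGENTS, FEAT_DIM)
        scores = torch.bmm(self.q(feats), self.k(feats).transpose(1, 2))
        attn = torch.softmax(scores / FEAT_DIM ** 0.5, dim=-1)
        return torch.bmm(attn, self.v(feats)), attn  # attn ~ soft adjacency


if __name__ == "__main__":
    mixer, gat = QMixMixer(), AgentGraphAttention()
    qs = torch.randn(2, N_AGENTS)           # per-agent Q-values
    s = torch.randn(2, STATE_DIM)           # global (categorized) state
    f = torch.randn(2, N_AGENTS, FEAT_DIM)  # per-agent features
    print(mixer(qs, s).shape)               # torch.Size([2, 1])
    print(gat(f)[1].shape)                  # torch.Size([2, 4, 4])
```

In this sketch, state categorization would correspond to feeding a reduced, categorized representation of the global state into the hypernetworks instead of the raw state, which is what shrinks the input dimensionality and hence the computation.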