AlphaStar, the AI that reached GrandMaster level in StarCraft II, is a remarkable milestone demonstrating what deep reinforcement learning can achieve in complex Real-Time Strategy (RTS) games. However, the complexity of the game, algorithms, and systems, and especially the tremendous amount of computation required, are major obstacles for the community to conduct further research in this direction. We propose a deep reinforcement learning agent, StarCraft Commander (SCC). With an order of magnitude less computation, it demonstrates top human performance, defeating GrandMaster players in test matches and top professional players in a live event. Moreover, it shows strong robustness to various human strategies and discovers novel strategies unseen in human play. In this paper, we share the key insights and optimizations for efficient imitation learning and reinforcement learning in the full game of StarCraft II.