StarCraft II (SC2) is a real-time strategy game in which players produce and control multiple units to win. Due to its difficulties, such as a huge state space, a large action space, a long time horizon, and imperfect information, SC2 has been a highlight of reinforcement learning research. Recently, an SC2 agent called AlphaStar was proposed, which shows excellent performance, obtaining a high win rate of 99.8% against Grandmaster-level human players. We implemented a mini-scaled version of it, called mini-AlphaStar, based on the original paper and the pseudocode the authors provided. Its usage and analysis are presented in this technical report. The difference between AlphaStar and mini-AlphaStar is that we substituted the hyper-parameters of the former with much smaller ones to enable mini-scale training. The code of mini-AlphaStar is fully open-sourced. The objective of mini-AlphaStar is to provide a reproduction of the original AlphaStar and to facilitate future RL research on large-scale problems.