Inspired by the recent success of transformers in natural language processing and computer vision applications, we introduce a transformer-based neural architecture for two key StarCraft II (SC2) macromanagement tasks: global state prediction and build order prediction. Unlike recurrent neural networks, which suffer from a recency bias, transformers are able to capture patterns across very long time horizons, making them well suited for full-game analysis. Our model uses the MSC (Macromanagement in StarCraft II) dataset and improves on the top-performing gated recurrent unit (GRU) architecture in predicting global state and build order, as measured by mean accuracy over multiple time horizons. We present ablation studies on our proposed architecture that support our design decisions. One key advantage of transformers is their ability to generalize well, and we demonstrate that our model achieves even higher accuracy in a transfer learning setting, in which models trained on games with one racial matchup (e.g., Terran vs. Protoss) are transferred to a different one. We believe that transformers' ability to model long games, their potential for parallelization, and their generalization performance make them an excellent choice for StarCraft agents.
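To make the setup concrete, the following is a minimal sketch of how a transformer could be wired up for these two tasks, assuming per-frame feature vectors like those in MSC and a PyTorch implementation; the class name, feature dimension, action-vocabulary size, and hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# Illustrative sketch only: dimensions, action count, and layer sizes are
# assumptions for demonstration, not the architecture described in the paper.
import torch
import torch.nn as nn

class MacroTransformer(nn.Module):
    def __init__(self, feat_dim=86, d_model=128, n_heads=4, n_layers=3,
                 n_actions=75, max_len=512):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)       # project frame features
        self.pos = nn.Embedding(max_len, d_model)       # learned positional encoding
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.global_head = nn.Linear(d_model, 1)        # global state: win probability
        self.build_head = nn.Linear(d_model, n_actions) # build order: next unit/building

    def forward(self, x):
        # x: (batch, time, feat_dim) sequence of game-state features
        t = torch.arange(x.size(1), device=x.device)
        # Causal mask so each frame attends only to earlier frames
        mask = torch.triu(torch.full((x.size(1), x.size(1)), float("-inf"),
                                     device=x.device), diagonal=1)
        h = self.encoder(self.embed(x) + self.pos(t), mask=mask)
        return torch.sigmoid(self.global_head(h)), self.build_head(h)

model = MacroTransformer()
frames = torch.randn(2, 100, 86)  # 2 replays, 100 frames each
win_prob, next_action_logits = model(frames)
print(win_prob.shape, next_action_logits.shape)  # (2, 100, 1) (2, 100, 75)
```

The causal mask restricts each frame's attention to earlier frames, so the prediction at time t mirrors the information that would be available to an in-game agent at that point.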