As a relatively new form of sport, esports offers unparalleled data availability. Despite the vast amounts of data generated by game engines, extracting these data and verifying their integrity for practical and scientific use can be challenging. Our work aims to open esports to a broader scientific community by supplying raw and pre-processed files from StarCraft II esports tournaments. These files can be used in statistical and machine learning modeling tasks and can be related to various laboratory-based measurements (e.g., behavioral tests, brain imaging). We gathered publicly available, game-engine-generated "replays" of tournament matches and performed data extraction and cleanup using a low-level application programming interface (API) parser library. Additionally, we open-sourced and published all the custom tools that were developed in the process of creating our dataset. These tools include PyTorch and PyTorch Lightning API abstractions to load and model the data. Our dataset contains replays from major and premier StarCraft II tournaments since 2016. To prepare the dataset, we processed 55 tournament "replaypacks" that contained 17,930 files with game-state information. Based on an initial investigation of available StarCraft II datasets, ours is, at the time of its publication, the largest publicly available source of StarCraft II esports data. Analysis of the extracted data holds promise for further Artificial Intelligence (AI), Machine Learning (ML), psychological, Human-Computer Interaction (HCI), and sports-related studies in a variety of supervised and self-supervised tasks.
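To illustrate how pre-processed replay files might be exposed to a PyTorch-style loader, the following is a minimal sketch and not the published tooling. It assumes one JSON file per replay stored under a directory; the class name `ReplayJsonDataset`, the `transform` parameter, and the directory layout are hypothetical. The class implements the map-style protocol (`__len__` and `__getitem__`) that `torch.utils.data.DataLoader` accepts, using only the Python standard library.

```python
import json
from pathlib import Path


class ReplayJsonDataset:
    """Map-style dataset over a directory of pre-processed replay JSON files.

    Assumption: each file holds one JSON object describing one replay.
    The layout and any field names are illustrative, not the real schema.
    """

    def __init__(self, root, transform=None):
        # Sort the paths so that indexing is deterministic across runs.
        self.paths = sorted(Path(root).rglob("*.json"))
        # Optional callable that maps a parsed record to model inputs.
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        # Parse lazily, one file per access, to keep memory use low.
        with self.paths[index].open(encoding="utf-8") as handle:
            record = json.load(handle)
        return self.transform(record) if self.transform else record
```

An instance of this class could be passed to a `DataLoader`; because replay records are nested and of varying length, a custom `collate_fn` would likely be needed for batching.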