Access to accurate game state information is essential for artificial intelligence tasks in games, including game-playing, testing, player modeling, and procedural content generation. Self-Supervised Learning (SSL) techniques have been shown to infer accurate game state information by compressing the high-dimensional pixel input of game footage into latent representations. Contrastive Learning is a popular SSL paradigm in which visual understanding of the game's images comes from contrasting similar and dissimilar game states, which are defined by simple image augmentation methods. In this study, we introduce a new game scene augmentation technique, named GameCLR, that takes advantage of the game engine to define and synthesize specific, highly controlled renderings of different game states, thereby boosting contrastive learning performance. We test GameCLR on images from the CARLA driving simulator environment and compare it against the popular SimCLR baseline SSL method. Our results suggest that GameCLR infers the game's state information from game footage more accurately than the baseline. Our proposed approach allows us to conduct game artificial intelligence research by directly using screen pixels as input.
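To make the contrastive objective concrete, the following is a minimal NumPy sketch of the NT-Xent loss used by the SimCLR baseline. It is an illustration under stated assumptions, not the implementation used in this study: the abstract does not specify GameCLR's loss, the function name `nt_xent_loss` and the temperature value are chosen here for illustration, and the embeddings are random vectors standing in for encoder outputs. Row `i` of `z_a` and row `i` of `z_b` are assumed to be embeddings of two views of the same game state; in SimCLR those views come from image augmentations, whereas in an engine-driven setup they would presumably come from controlled re-renderings.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for N positive pairs (z_a[i], z_b[i]).

    z_a, z_b: arrays of shape (N, D) holding embeddings of two views
    of the same N game states. All other samples in the batch act as
    negatives for a given anchor.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)             # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit length -> cosine similarity
    sim = (z @ z.T) / temperature                      # (2N, 2N) scaled similarities
    np.fill_diagonal(sim, -np.inf)                     # an anchor is never its own candidate

    # Index of the positive partner for each of the 2N anchors.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-sum-exp over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_denom = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))

    log_prob = sim[np.arange(2 * n), pos] - log_denom
    return float(-log_prob.mean())


# Hypothetical stand-in embeddings (assumption: no real encoder or game data here).
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))
aligned_views = anchors + 0.01 * rng.normal(size=(8, 16))  # near-identical views
unrelated_views = rng.normal(size=(8, 16))                 # views of unrelated states

loss_aligned = nt_xent_loss(anchors, aligned_views)
loss_unrelated = nt_xent_loss(anchors, unrelated_views)
print(loss_aligned, loss_unrelated)
```

In this sketch the loss is lower when the paired views are close in embedding space than when they are unrelated, which is the signal a contrastive encoder is trained on. How the positive and negative views are produced is exactly where an augmentation technique would plug in.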