The OpenAI Gym project contains hundreds of control problems intended as a testbed for reinforcement learning algorithms. One such problem is Freeway-ram-v0, in which the observations presented to the agent are 128 bytes of RAM. While the goal of the project is for non-expert AI agents to solve the control problems with general training, in this work we seek to learn more about the problem itself, so that we can better evaluate solutions. In particular, we develop an oracle to play the game, so that we have baselines for success. We present details of the oracle, plus optimal game-playing situations that can be used for training and testing AI agents.