We introduce ApolloRL, an open platform for reinforcement learning research in autonomous driving. The platform provides a complete closed-loop pipeline with training, simulation, and evaluation components. It ships with 300 hours of real-world driving-scenario data and popular baselines such as Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) agents. In this paper, we describe the platform's architecture and the environment it defines, and we discuss the performance of the baseline agents in the ApolloRL environment.