Connected autonomous vehicles will make autonomous intersection management a reality, replacing traditional traffic signal control. Autonomous intersection management requires adjusting the timing and speed of vehicles arriving at an intersection so that they pass through it without collisions. Because of its computational complexity, this problem has been studied only under the assumption that vehicle arrival times in the vicinity of the intersection are known beforehand, which limits the applicability of these solutions to real-time deployment. To solve the real-time autonomous traffic intersection management problem, we propose a reinforcement learning (RL) based multiagent architecture and a novel RL algorithm that we call multi-discount Q-learning. In multi-discount Q-learning, we introduce a simple yet effective way to solve a Markov Decision Process while preserving both short-term and long-term goals, which is crucial for collision-free speed control. Our empirical results show that our RL-based multiagent solution can efficiently achieve near-optimal performance when minimizing the travel time through an intersection.
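The abstract does not give the update rule for multi-discount Q-learning, so the following is only an illustrative sketch of one way short-term and long-term goals could be preserved together, not the authors' algorithm. It assumes a tabular setting with one Q estimate per discount factor, where a small discount stands in for short-term goals such as collision avoidance and a large discount stands in for long-term goals such as travel time. The class name `MultiDiscountQ`, the weighted combination used for action selection, and all parameter values are hypothetical.

```python
from collections import defaultdict


class MultiDiscountQ:
    """Hypothetical sketch: one tabular Q estimate ("head") per discount factor."""

    def __init__(self, n_actions, gammas=(0.5, 0.99), weights=(0.5, 0.5), alpha=0.1):
        assert len(gammas) == len(weights)
        self.n_actions = n_actions
        self.gammas = gammas      # assumed: small gamma = short-term, large = long-term
        self.weights = weights    # assumed: how the heads are mixed for acting
        self.alpha = alpha        # learning rate
        # q[k][state] is the list of action values learned with discount gammas[k]
        self.q = [defaultdict(lambda: [0.0] * n_actions) for _ in gammas]

    def combined(self, state):
        """Weighted sum of the per-discount action values for one state."""
        return [
            sum(w * q[state][a] for w, q in zip(self.weights, self.q))
            for a in range(self.n_actions)
        ]

    def greedy_action(self, state):
        """Action with the highest combined value (ties go to the lowest index)."""
        values = self.combined(state)
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, reward, next_state, done):
        """One Q-learning step applied to every head with its own discount."""
        # All heads bootstrap from the same greedy action of the combined estimate.
        next_action = self.greedy_action(next_state)
        for gamma, q in zip(self.gammas, self.q):
            bootstrap = 0.0 if done else gamma * q[next_state][next_action]
            target = reward + bootstrap
            q[state][action] += self.alpha * (target - q[state][action])
```

In this sketch every head sees the same reward and differs only in how far it looks ahead. Whether the published method shares rewards, combines estimates by weighting, or uses a different mechanism entirely is not stated in the abstract.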