Reinforcement learning (RL) for traffic signal control (TSC) has outperformed conventional approaches in simulation when controlling traffic flow at intersections. However, owing to several challenges, no RL-based TSC system has yet been deployed in the field. One major challenge for real-world deployment is ensuring that all safety requirements are met at all times during operation. We present an approach that ensures safety at a real-world intersection by using an action space that is safe by design. The action space consists of traffic phases, each of which represents a combination of non-conflicting signal colors at the intersection. In addition, an action masking mechanism ensures that only appropriate phase transitions are carried out. Another challenge for real-world deployment is ensuring control behavior that avoids stress for road users. We demonstrate how to achieve this by incorporating domain knowledge through an extension of the action masking mechanism. We test and verify our approach in a realistic simulation scenario. By ensuring safety and psychologically pleasant control behavior, our approach advances the development of RL for TSC towards real-world deployment.
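To make the idea of a phase-based action space with action masking more concrete, the following is a minimal sketch and not the authors' implementation. The four phase names, the transition table `ALLOWED`, and the timing thresholds `MIN_GREEN_S` and `MAX_GREEN_S` are all hypothetical values chosen for illustration. The abstract does not specify any of them, nor does it state that minimum or maximum green times are the domain knowledge used.

```python
import math
import random

# Hypothetical four-phase intersection; names are illustrative only.
# Each phase is assumed to be a conflict-free combination of signal colors.
PHASES = ["NS_through", "NS_left", "EW_through", "EW_left"]

# Assumed transition table: ALLOWED[i] is the set of phases reachable from i.
ALLOWED = {
    0: {0, 1, 2},
    1: {1, 2},
    2: {2, 3, 0},
    3: {3, 0},
}

MIN_GREEN_S = 10  # assumed minimum green time (example domain-knowledge rule)
MAX_GREEN_S = 60  # assumed maximum green time (example domain-knowledge rule)


def action_mask(current, elapsed_s):
    """Return one boolean per phase: True if the agent may select it."""
    if elapsed_s < MIN_GREEN_S:
        # Too early to switch: only holding the current phase is permitted.
        return [p == current for p in range(len(PHASES))]
    mask = [p in ALLOWED[current] for p in range(len(PHASES))]
    if elapsed_s >= MAX_GREEN_S:
        # The phase has run for the maximum time: force a switch.
        mask[current] = False
    return mask


def masked_sample(logits, mask, rng=random):
    """Sample an action from a softmax over the unmasked logits only."""
    masked = [l if m else -math.inf for l, m in zip(logits, mask)]
    top = max(masked)  # finite as long as at least one action is allowed
    weights = [math.exp(l - top) for l in masked]  # masked actions get 0.0
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Usage: phase 1 has been active for 20 s, so phases 1 and 2 are selectable.
mask = action_mask(current=1, elapsed_s=20)
action = masked_sample([0.0, 5.0, 1.0, 9.0], mask)
```

Because masked actions receive zero sampling probability, the agent cannot select a disallowed transition, whatever logits the policy produces. In this sketch, safety within a phase rests on the assumption that every entry of `PHASES` is itself conflict-free.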