Traffic signal control has the potential to reduce congestion in dynamic networks. Recent studies show that reinforcement learning (RL) methods for traffic signal control can significantly reduce average waiting time. However, a shortcoming of existing methods is that the model must be retrained for new intersections with different structures. In this paper, we propose ADLight, a novel reinforcement learning approach with augmented data that trains a universal model for intersections with different structures. We propose a new agent design that incorporates movement-level features and actions that set the duration of the current phase, so that the generalized model has the same structure for different intersections. A new data augmentation method named \textit{movement shuffle} is developed to improve generalization performance. We also test the universal model on new intersections in Simulation of Urban MObility (SUMO). The results show that the performance of our approach is close to that of models trained directly in a single environment (only a 5% loss in average waiting time), while reducing training time by more than 80%, which saves substantial computational resources in scalable traffic light operations.
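The abstract names movement shuffle without giving its details, so the following is only a minimal sketch of one plausible reading: the order of an intersection's movements is randomly permuted, and the same permutation is applied to the description of which movements each phase serves. The data layout (one feature row per movement, one 0/1 mask per phase) and the function name `movement_shuffle` are assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import List, Sequence, Tuple


def movement_shuffle(
    movement_features: Sequence[Sequence[float]],
    phase_masks: Sequence[Sequence[int]],
    rng: random.Random,
) -> Tuple[List[List[float]], List[List[int]], List[int]]:
    """Randomly reorder the movements of one observation (hypothetical layout).

    movement_features: one feature row per movement.
    phase_masks: one 0/1 row per phase; entry j is 1 if the phase serves movement j.
    Returns the shuffled features, the shuffled masks, and the permutation used.
    """
    n = len(movement_features)
    if any(len(mask) != n for mask in phase_masks):
        raise ValueError("every phase mask needs one entry per movement")

    perm = list(range(n))
    rng.shuffle(perm)

    # Apply the same permutation to both inputs, so each phase still
    # refers to the same physical movements after the reorder.
    shuffled_features = [list(movement_features[i]) for i in perm]
    shuffled_masks = [[mask[i] for i in perm] for mask in phase_masks]
    return shuffled_features, shuffled_masks, perm


# Toy observation: 4 movements with 2 assumed features each, and 2 phases.
features = [[3.0, 0.0], [1.0, 1.0], [0.0, 0.0], [5.0, 1.0]]
masks = [[1, 0, 1, 0], [0, 1, 0, 1]]
new_features, new_masks, perm = movement_shuffle(features, masks, random.Random(0))
```

Under this reading, a model trained on many shuffled copies of the same observation cannot rely on a fixed movement order, which is one way such an augmentation could help a single model transfer across intersections with different layouts.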