Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection

Intelligent traffic signal controllers that apply deep Q-network (DQN) algorithms to traffic light policy optimization reduce congestion efficiently by adjusting signals to real-time traffic. Most proposals in the literature, however, assume that every vehicle at the intersection is detected, which is unrealistic. Recently, new wireless communication technologies have enabled cost-efficient detection of connected vehicles by infrastructure. Because only a small fraction of the total fleet is currently equipped, methods that perform well under low detection rates are desirable. In this paper, we propose a deep reinforcement Q-learning model to optimize traffic signal control at an isolated intersection in a partially observable environment with connected vehicles. First, we present the novel DQN model within the reinforcement learning (RL) framework. We introduce a new state representation for partially observable environments and a new reward function for traffic signal control, and we provide a network architecture and tuned hyper-parameters. Second, we evaluate the performance of the model in numerical simulations on multiple scenarios, in two steps: first under full detection, against existing actuated controllers, and then under partial detection, with loss estimates for different proportions of connected vehicles. Finally, from the results obtained, we define detection-rate thresholds for acceptable and optimal performance levels.
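To make the notion of partial detection concrete, the following is a minimal sketch, not the paper's actual simulation setup. It assumes that each vehicle is a connected vehicle, and is therefore detected, independently with a fixed probability. The function name `detect_connected_vehicles`, the parameter `detection_rate`, and the lane counts are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Optional


def detect_connected_vehicles(
    queue_lengths: List[int],
    detection_rate: float,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return, per lane, how many of the true vehicles are observed.

    Assumption (not taken from the paper): each vehicle is connected,
    and hence detected, independently with probability `detection_rate`.
    """
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must lie in [0, 1]")
    if rng is None:
        rng = random.Random()
    # One Bernoulli draw per vehicle; count the successes on each lane.
    return [
        sum(1 for _ in range(n) if rng.random() < detection_rate)
        for n in queue_lengths
    ]


if __name__ == "__main__":
    # Hypothetical true vehicle counts on four approaches of an intersection.
    true_queues = [12, 3, 7, 0]
    for rate in (1.0, 0.5, 0.1):
        observed = detect_connected_vehicles(true_queues, rate, random.Random(0))
        print(f"detection rate {rate}: observed {observed}")
```

With a detection rate of 1.0 the controller sees the true counts, which corresponds to the full detection setting. Lower rates give the controller only a thinned sample of the traffic, which is the regime in which a partial-detection state representation has to operate.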
