This research studies the network traffic signal control problem. It uses a Lyapunov control function to derive the back pressure method, in which the back pressure equals the differential queue lengths weighted by intersection lane flows. Lyapunov control theory provides a framework that unifies several current theories of intersection signal control. We further use the theory to derive flow-based and other pressure-based signal control algorithms. For example, the Dynamic, Optimal, Real-time Algorithm for Signals (DORAS) may be derived by defining the Lyapunov function as the sum of queue lengths. The study then uses the back pressure as the reward in reinforcement learning (RL) based network signal control, with the agent trained using a double Deep Q-Network (Double-DQN). The proposed algorithm is compared with several traditional and RL-based methods under both passenger traffic flow and mixed flow that includes freight traffic. The numerical tests are conducted on a single corridor and on a local grid network under three traffic demand scenarios: low, medium, and heavy. The numerical simulations demonstrate that the proposed algorithm outperforms the other methods in terms of average vehicle waiting time on the network.
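The description of back pressure as differential queue lengths weighted by lane flows can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Movement` structure, the function names, the per-movement form `flow * (upstream_queue - downstream_queue)`, the rule of serving the phase with the largest summed pressure, and the sample numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movement:
    """One permitted movement through an intersection (hypothetical structure)."""
    upstream_queue: float    # vehicles queued on the incoming lane
    downstream_queue: float  # vehicles queued on the receiving lane
    flow: float              # lane flow used as the weight (assumed units: veh/s)


def movement_pressure(m: Movement) -> float:
    # Assumed form: differential queue length weighted by the lane flow.
    return m.flow * (m.upstream_queue - m.downstream_queue)


def phase_pressure(movements: list[Movement]) -> float:
    # Pressure of a signal phase: sum over the movements it serves.
    return sum(movement_pressure(m) for m in movements)


def select_phase(phases: dict[str, list[Movement]]) -> str:
    # Assumed control rule: serve the phase with the largest pressure.
    return max(phases, key=lambda name: phase_pressure(phases[name]))


# Made-up queue lengths and flows for a two-phase intersection.
phases = {
    "north_south": [Movement(12, 3, 0.5), Movement(8, 6, 0.5)],
    "east_west": [Movement(4, 1, 0.5), Movement(5, 5, 0.5)],
}
pressures = {name: phase_pressure(ms) for name, ms in phases.items()}
chosen = select_phase(phases)
print(pressures, chosen)
```

In an RL setting, a quantity like `phase_pressure` could feed the reward signal; the sign and scaling of such a reward are not specified here and would need to follow the paper's own definition.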