Cooperative coordination at unsignalized road intersections, which aims to improve driving safety and traffic throughput for connected and automated vehicles, has attracted increasing interest in recent years. However, most existing approaches either suffer from high computational complexity or cannot harness the full potential of the road infrastructure. To this end, we first present a dedicated intersection coordination framework in which the involved vehicles hand over their control authority and follow instructions from a centralized coordinator. We then formulate a unified cooperative trajectory optimization problem that maximizes traffic throughput while ensuring driving safety and the long-term stability of the coordination system. To address the key computational challenges of real-world deployment, we reformulate this non-convex sequential decision problem as a model-free Markov Decision Process (MDP) and solve it with a strategy based on Twin Delayed Deep Deterministic Policy Gradient (TD3) within the deep reinforcement learning (DRL) framework. Simulations and practical experiments show that the proposed strategy achieves near-optimal performance in sub-static coordination scenarios and significantly improves traffic throughput under realistic continuous traffic flow. Its most notable advantage is that it reduces the computation time to the order of milliseconds and is shown to scale as the number of road lanes increases.
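For readers unfamiliar with TD3, the following is a minimal sketch of how a TD3-style critic target is typically computed for a single transition, using the two standard ingredients of the general algorithm: target policy smoothing and clipped double-Q learning. It is not the coordination strategy described above. The scalar action (for instance, an acceleration command), the function names `clip` and `td3_target`, the callables `target_actor`, `target_q1`, `target_q2`, and all default hyperparameter values are assumptions chosen for illustration only.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-1.0, act_high=1.0, rng=random):
    """Compute a TD3-style bootstrap target for one transition.

    All arguments are hypothetical placeholders: target_actor maps a state
    to a scalar action, target_q1 and target_q2 map (state, action) to a
    scalar value estimate.
    """
    # Target policy smoothing: perturb the target action with clipped
    # Gaussian noise, then keep it inside the valid action range.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise, act_low, act_high)

    # Clipped double-Q: use the smaller of the two target critic estimates
    # to counteract value overestimation.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))

    # Do not bootstrap past a terminal state.
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * q_next
```

In a full TD3 training loop, both critics would be regressed toward this target, while the actor and the target networks would be updated less frequently than the critics (the "delayed" part of the name). Those steps are omitted here to keep the sketch short.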