Cooperative coordination at unsignalized road intersections, which aims to improve driving safety and traffic throughput for connected and automated vehicles, has attracted increasing interest in recent years. However, most existing studies either suffer from high computational complexity or cannot harness the full potential of the road infrastructure. To this end, we first present a dedicated intersection coordination framework in which the involved vehicles hand over their control authority and follow instructions from a centralized coordinator. We then formulate a unified cooperative trajectory optimization problem that maximizes traffic throughput while ensuring driving safety and the long-term stability of the coordination system. To address the key computational challenges of real-world deployment, we reformulate this non-convex sequential decision problem as a model-free Markov Decision Process (MDP) and tackle it with a Twin Delayed Deep Deterministic Policy Gradient (TD3)-based strategy within the deep reinforcement learning (DRL) framework. Simulation and practical experiments show that the proposed strategy achieves near-optimal performance in sub-static coordination scenarios and significantly improves traffic throughput in realistic continuous traffic flow. Its most notable advantage is that it reduces computation time to milliseconds, and it is shown to scale as the number of road lanes increases.
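
For readers unfamiliar with TD3, the following is a minimal sketch of the critic target computation that distinguishes it from plain DDPG: target policy smoothing (clipped noise on the target action) and clipped double-Q learning (the minimum of two target critics). It is not the authors' implementation. The function name `td3_target`, the callables `actor_target`, `q1_target`, and `q2_target`, and all hyperparameter defaults are assumptions chosen for illustration, and the abstract does not specify the state, action, or reward design used for the intersection coordinator.

```python
import numpy as np


def td3_target(reward, next_state, done, actor_target, q1_target, q2_target,
               rng, gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Compute TD3 critic targets for a batch of transitions.

    All argument names and defaults are illustrative assumptions.
    reward, done: arrays of shape (batch,)
    next_state:   array of shape (batch, state_dim)
    actor_target: callable, next_state -> actions of shape (batch, action_dim)
    q*_target:    callables, (next_state, action) -> values of shape (batch,)
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    next_action = actor_target(next_state)
    noise = rng.normal(0.0, noise_std, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Clipped double-Q: take the smaller of the two target critic estimates
    # to counter overestimation bias.
    q_min = np.minimum(q1_target(next_state, next_action),
                       q2_target(next_state, next_action))

    # One-step bootstrapped target; terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - done) * q_min
```

In a full TD3 loop, both critics would be regressed toward this target at every step, while the actor and the target networks would be updated less frequently (the "delayed" part of the name).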