Communication technologies enable coordination among connected and autonomous vehicles (CAVs). However, it remains unclear how to utilize shared information to improve the safety and efficiency of the CAV system in dynamic and complex driving scenarios. In this work, we propose a constrained multi-agent reinforcement learning (MARL) framework with a parallel Safety Shield for CAVs in challenging driving scenarios that include unconnected hazard vehicles. The coordination mechanisms of the proposed MARL include information sharing and cooperative policy learning, with a Graph Convolutional Network (GCN)-Transformer spatial-temporal encoder that enhances the agents' awareness of the environment. The Safety Shield module performs Control Barrier Function (CBF)-based safety checking to protect the agents from taking unsafe actions. We design a constrained multi-agent advantage actor-critic (CMAA2C) algorithm to train safe and cooperative policies for the CAVs. Through experiments deployed in the CARLA simulator, we verify the safety checking, spatial-temporal encoder, and coordination mechanisms designed in our method via comparative evaluations in several challenging scenarios with unconnected hazard vehicles. Results show that our proposed methodology significantly improves system safety and efficiency in these challenging scenarios.
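To make the Safety Shield idea concrete, the sketch below shows one way a CBF-based safety check can filter an RL action before execution. It is a minimal illustration only: the paper's actual barrier functions, dynamics model, fallback policy, and parameter values are not given here, so the time-headway barrier form, the functions barrier(), step(), and shield(), and all constants are illustrative assumptions rather than the authors' implementation.

```python
import math

# Illustrative discrete-time CBF safety check for a single CAV following a
# lead vehicle. h(x) >= 0 defines the assumed safe set (hypothetical barrier).

def barrier(gap, v_ego, v_lead, time_headway=1.5, min_gap=2.0):
    """Assumed barrier: h(x) = gap - min_gap - time_headway * max(v_ego - v_lead, 0)."""
    return gap - min_gap - time_headway * max(v_ego - v_lead, 0.0)

def step(gap, v_ego, v_lead, accel, dt=0.1):
    """Simple kinematic prediction of the next state under longitudinal action `accel`."""
    v_next = max(v_ego + accel * dt, 0.0)
    gap_next = gap + (v_lead - v_ego) * dt
    return gap_next, v_next

def shield(gap, v_ego, v_lead, rl_accel, gamma=0.5, fallback_accel=-4.0, dt=0.1):
    """Return the RL action if it satisfies the discrete-time CBF condition
    h(x_{t+1}) >= (1 - gamma) * h(x_t); otherwise override with a conservative
    braking fallback (a stand-in for whatever safe backup controller is used)."""
    h_now = barrier(gap, v_ego, v_lead)
    gap_next, v_next = step(gap, v_ego, v_lead, rl_accel, dt)
    h_next = barrier(gap_next, v_next, v_lead)
    if h_next >= (1.0 - gamma) * h_now:
        return rl_accel        # RL action certified safe by the CBF condition
    return fallback_accel      # unsafe action blocked; apply fallback braking

# Example: a 20 m gap at matched speeds lets a mild acceleration pass the check,
# while a large acceleration at a small gap would be overridden.
print(shield(gap=20.0, v_ego=10.0, v_lead=10.0, rl_accel=1.0))
```

In practice, such a check would run in parallel for every controlled CAV at each control step, which matches the "parallel Safety Shield" role described above, while the details of the barrier design remain specific to the paper.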