Communication technologies enable coordination among connected and autonomous vehicles (CAVs). However, it remains unclear how to utilize shared information to improve the safety and efficiency of the CAV system. In this work, we propose a constrained multi-agent reinforcement learning (MARL) framework with a parallel safety shield for CAVs in challenging driving scenarios. The coordination mechanisms of the proposed MARL include information sharing and cooperative policy learning, with a Graph Convolutional Network (GCN)-Transformer serving as a spatial-temporal encoder that enhances the agents' awareness of the environment. The safety shield module, which performs Control Barrier Function (CBF)-based safety checking, prevents the agents from taking unsafe actions. We design a constrained multi-agent advantage actor-critic (CMAA2C) algorithm to train safe and cooperative policies for CAVs. Through experiments deployed in the CARLA simulator, we verify the effectiveness of the safety checking, the spatial-temporal encoder, and the coordination mechanisms in our method via comparative experiments in several challenging scenarios involving the defined hazard vehicles (HAZVs). Results show that the proposed methodology significantly improves system safety and efficiency in these challenging scenarios.
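For illustration, the sketch below shows one way a CBF-based safety check of the kind the shield performs could be realized for longitudinal car following. The barrier definition, constants, fallback action, and function names are assumptions made for this example, not the paper's exact formulation.

```python
# A minimal sketch of a CBF-style safety check for longitudinal car following.
# The barrier h(x) = gap - T_H * v_ego keeps a headway-scaled distance to the
# leading vehicle; a candidate action is accepted only if it satisfies the
# discrete-time CBF condition h(x_{t+1}) >= (1 - GAMMA) * h(x_t).
# All constants and names below are illustrative assumptions.

T_H = 1.5      # desired time headway (s), assumed
GAMMA = 0.2    # CBF decay rate in (0, 1], assumed
DT = 0.1       # simulation step (s), assumed

def barrier(gap: float, v_ego: float) -> float:
    """h(x) >= 0 encodes 'gap is at least the headway distance'."""
    return gap - T_H * v_ego

def next_state(gap, v_ego, v_lead, accel):
    """Constant-acceleration prediction over one step."""
    v_next = max(0.0, v_ego + accel * DT)
    gap_next = gap + (v_lead - v_ego) * DT
    return gap_next, v_next

def is_safe(gap, v_ego, v_lead, accel) -> bool:
    """Discrete-time CBF condition: h(x_{t+1}) - h(x_t) >= -GAMMA * h(x_t)."""
    h_now = barrier(gap, v_ego)
    gap_n, v_n = next_state(gap, v_ego, v_lead, accel)
    return barrier(gap_n, v_n) >= (1.0 - GAMMA) * h_now

def shield(gap, v_ego, v_lead, rl_accel, fallback=-4.0):
    """Pass the RL action through if safe; otherwise substitute a braking fallback."""
    return rl_accel if is_safe(gap, v_ego, v_lead, rl_accel) else fallback
```

Under this scheme, the learned policy's action is executed only when it keeps the barrier value from decaying faster than the allowed rate; otherwise the shield substitutes a conservative braking action.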
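Similarly, the following PyTorch sketch illustrates the general shape of a GCN-Transformer spatial-temporal encoder: a graph convolution aggregates neighboring vehicles' features at each time step, and a Transformer encoder then attends over the time dimension. The feature dimensions, layer counts, and the choice of reading out only the ego vehicle's embedding are assumptions made for this example.

```python
import torch
import torch.nn as nn

class GCNTransformerEncoder(nn.Module):
    """Minimal spatial-temporal encoder sketch: one GCN layer per time step
    followed by temporal self-attention over the horizon. Dimensions and
    layer counts are illustrative assumptions."""

    def __init__(self, feat_dim=8, hidden_dim=64, n_heads=4, n_layers=2):
        super().__init__()
        self.gcn_proj = nn.Linear(feat_dim, hidden_dim)  # shared GCN projection
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=n_heads, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x, adj):
        # x:   (batch, time, vehicles, feat_dim) node features
        # adj: (batch, time, vehicles, vehicles) adjacency incl. self-loops
        deg = adj.sum(-1, keepdim=True).clamp(min=1.0)
        h = torch.relu(self.gcn_proj(torch.matmul(adj / deg, x)))  # spatial aggregation
        ego = h[:, :, 0, :]            # ego vehicle's embedding at each time step (assumed index 0)
        return self.temporal(ego)      # temporal attention over the horizon
```

The encoder's output can then be fed to the actor and critic heads of the policy-learning stage, which is one plausible way to connect the spatial-temporal encoding to the CMAA2C training described above.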