As an emerging technology, Connected Autonomous Vehicles (CAVs) are believed to be able to move through intersections faster and more safely, thanks to effective Vehicle-to-Everything (V2X) communication and global observation. Autonomous intersection management (AIM) is a key path to efficient crossing at intersections: it reduces unnecessary slowdowns and stops through the adaptive decision process of each CAV, enabling fuller utilization of the intersection space. Distributed reinforcement learning (DRL) offers a flexible, end-to-end model for AIM that adapts to many intersection scenarios. However, DRL is prone to collisions because the actions of the multiple parties in these complicated interactions are sampled from a generic policy, which restricts the application of DRL in realistic scenarios. To address this, we propose a hierarchical RL framework in which the models at different levels vary in receptive scope, action step length, and reward feedback period. The upper-layer model accelerates CAVs to keep them from colliding, while the lower-layer model adjusts the trends from the upper-layer model so that the change in motion state does not cause new conflicts. The real action of a CAV at each step is co-determined by the trends from both levels, forming a real-time balance in the adversarial process. The proposed model is shown to be effective in an experiment on a complicated intersection with 4 branches and 4 lanes per branch, and it performs better than the baselines.
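The two-level interaction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the additive combination rule, the clipping bound `limit`, the period `upper_period`, and the function names `combine_trends` and `rollout` are all assumptions made for illustration. The abstract states only that the levels differ in action step length and that both trends co-determine the real action.

```python
from typing import Callable, List


def combine_trends(upper: float, lower: float, limit: float) -> float:
    """Assumed rule: add the two trends, then clip to [-limit, limit]."""
    return max(-limit, min(limit, upper + lower))


def rollout(
    upper_policy: Callable[[int], float],
    lower_policy: Callable[[int, float], float],
    n_steps: int,
    upper_period: int = 5,   # hypothetical: upper level acts every 5 steps
    limit: float = 3.0,      # hypothetical bound on the combined action
) -> List[float]:
    """Run both levels and return the combined action at every step."""
    actions: List[float] = []
    upper = 0.0
    for t in range(n_steps):
        # The upper level has a longer action step: refresh it only
        # once every `upper_period` steps and hold it in between.
        if t % upper_period == 0:
            upper = upper_policy(t)
        # The lower level acts at every step and sees the current upper trend.
        lower = lower_policy(t, upper)
        actions.append(combine_trends(upper, lower, limit))
    return actions


if __name__ == "__main__":
    # Stand-in policies (not learned models): the upper level always pushes
    # to accelerate, the lower level damps that push on odd steps.
    print(rollout(lambda t: 2.0, lambda t, u: -0.5 if t % 2 else 0.0, n_steps=4))
```

In a learned system the two lambda stand-ins would be replaced by the trained upper-level and lower-level models; how their outputs are actually merged is not specified in the abstract.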