We consider the problem of dynamic platoon leader selection, user association, channel assignment, and power allocation on a cellular vehicle-to-everything (C-V2X) based highway, where multiple vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) links share the frequency resources. Multiple roadside units (RSUs) are deployed along the highway, and vehicles can form platoons, an advanced use case identified to increase road efficiency. Traditional optimization methods, which require global channel information at a central controller, are not viable for high-mobility vehicular networks. To deal with this challenge, we propose a distributed multi-agent reinforcement learning (MARL) scheme for resource allocation (RA). Each platoon leader, acting as an agent, collaborates with other agents to perform joint sub-band selection and power allocation for its V2V links, and joint user association and power control for its V2I links. Moreover, each platoon can dynamically select the vehicle most suitable to serve as the platoon leader. We aim to maximize the V2V and V2I packet delivery probability within the desired latency using the deep Q-learning algorithm. Simulation results indicate that our proposed MARL scheme outperforms a centralized hill-climbing algorithm, and that platoon leader selection helps to improve both V2V and V2I performance.
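The abstract does not give implementation details, but the core mechanism is one deep Q-learning agent per platoon leader choosing a joint (sub-band, power-level) action from local observations. Below is a minimal sketch of such an agent under stated assumptions: the problem sizes (NUM_SUBBANDS, NUM_POWER_LEVELS, OBS_DIM), the network architecture, and all class and function names are hypothetical illustrations, not the authors' implementation.

```python
import random
from collections import deque

import torch
import torch.nn as nn

# Hypothetical problem sizes; the paper does not specify these values.
NUM_SUBBANDS = 4          # sub-bands shared by V2V and V2I links
NUM_POWER_LEVELS = 3      # discrete transmit-power levels
OBS_DIM = 16              # per-agent local observation (e.g., channel gains)
NUM_ACTIONS = NUM_SUBBANDS * NUM_POWER_LEVELS  # joint sub-band/power action


class QNetwork(nn.Module):
    """Maps a platoon leader's local observation to Q-values over
    joint (sub-band, power-level) actions."""

    def __init__(self, obs_dim: int, num_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, num_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class PlatoonLeaderAgent:
    """One MARL agent per platoon leader, trained with deep Q-learning."""

    def __init__(self, obs_dim=OBS_DIM, num_actions=NUM_ACTIONS,
                 gamma=0.99, lr=1e-3, eps=0.1):
        self.q = QNetwork(obs_dim, num_actions)
        self.q_target = QNetwork(obs_dim, num_actions)
        self.q_target.load_state_dict(self.q.state_dict())
        self.opt = torch.optim.Adam(self.q.parameters(), lr=lr)
        self.replay = deque(maxlen=10_000)
        self.gamma, self.eps, self.num_actions = gamma, eps, num_actions

    def act(self, obs):
        """Epsilon-greedy choice of a joint sub-band / power-level action."""
        if random.random() < self.eps:
            a = random.randrange(self.num_actions)
        else:
            with torch.no_grad():
                q_vals = self.q(torch.as_tensor(obs, dtype=torch.float32))
                a = int(q_vals.argmax())
        return divmod(a, NUM_POWER_LEVELS)  # -> (sub_band, power_level)

    def store(self, obs, action, reward, next_obs, done):
        sub_band, power = action
        self.replay.append((obs, sub_band * NUM_POWER_LEVELS + power,
                            reward, next_obs, float(done)))

    def learn(self, batch_size=64):
        """One DQN update from uniformly sampled replay transitions."""
        if len(self.replay) < batch_size:
            return
        batch = random.sample(self.replay, batch_size)
        obs, act, rew, nxt, done = map(
            lambda x: torch.as_tensor(x, dtype=torch.float32), zip(*batch))
        q = self.q(obs).gather(1, act.long().unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = rew + self.gamma * (1 - done) * self.q_target(nxt).max(1).values
        loss = nn.functional.mse_loss(q, target)
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
```

In this sketch the per-agent reward would be derived from the packet delivery outcomes of the V2V and V2I links, consistent with the objective of maximizing delivery probability within the latency budget; the exact reward shaping and the inter-agent collaboration mechanism are left unspecified here because the abstract does not describe them.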