This paper studies the allocation of shared resources between vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) links in vehicle-to-everything (V2X) communications. In existing algorithms, the dynamics of the vehicular environment and the quantization of continuous transmit power are the bottlenecks that prevent an effective and timely resource allocation policy. In this paper, we develop two algorithms to address these difficulties. First, we propose a deep reinforcement learning (DRL)-based resource allocation algorithm to improve the performance of both V2I and V2V links. Specifically, the algorithm uses a deep Q-network (DQN) to solve the sub-band assignment problem and deep deterministic policy gradient (DDPG) to solve the continuous power allocation problem. Second, we propose a meta-based DRL algorithm to enhance the fast adaptability of the resource allocation policy in dynamic environments. Numerical results demonstrate that the proposed DRL-based algorithm significantly outperforms a DQN-based algorithm that quantizes the continuous power. In addition, the proposed meta-based DRL algorithm achieves the required fast adaptation in a new environment with limited experience.
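To make the hybrid action structure described above concrete, the following is a minimal sketch, assuming a PyTorch implementation with invented state dimensions, sub-band count, and power limit (these parameters and the module names are illustrative, not the authors' code), of how a DQN head over discrete sub-bands can be paired with a DDPG-style actor that outputs a continuous transmit power, avoiding power quantization.

```python
# Illustrative sketch of the hybrid DQN + DDPG action structure.
# All sizes and limits below are assumptions for demonstration only.
import torch
import torch.nn as nn

STATE_DIM = 16      # assumed size of the local channel/QoS observation
NUM_SUBBANDS = 4    # assumed number of shareable sub-bands
P_MAX = 23.0        # assumed maximum V2V transmit power (dBm)


class DQNHead(nn.Module):
    """Q-values over the discrete sub-band choices."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, NUM_SUBBANDS),
        )

    def forward(self, state):
        return self.net(state)


class DDPGActor(nn.Module):
    """Deterministic policy mapping the state to a continuous power level."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),  # squash to (0, 1), then scale
        )

    def forward(self, state):
        return P_MAX * self.net(state)


def select_action(state, dqn, actor, epsilon=0.1):
    """Return (sub-band index, power): epsilon-greedy sub-band, deterministic power."""
    with torch.no_grad():
        if torch.rand(1).item() < epsilon:
            subband = torch.randint(NUM_SUBBANDS, (1,)).item()
        else:
            subband = dqn(state).argmax(dim=-1).item()
        power = actor(state).item()
    return subband, power


if __name__ == "__main__":
    state = torch.randn(STATE_DIM)
    subband, power = select_action(state, DQNHead(), DDPGActor())
    print(f"chosen sub-band: {subband}, transmit power: {power:.2f} dBm")
```

In this sketch the DDPG actor emits power directly on a continuous scale, which is the design choice the abstract contrasts with DQN-only schemes that must discretize the power levels.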