Optimal Beam Association for High Mobility mmWave Vehicular Networks: Lightweight Parallel Reinforcement Learning Approach

In intelligent transportation systems (ITS), vehicles are expected to support advanced applications and services that demand ultra-high data rates and low-latency communications. Millimeter wave (mmWave) communication has emerged as a very promising solution to meet these demands. However, incorporating mmWave into ITS is particularly challenging due to the high mobility of vehicles and the inherent sensitivity of mmWave beams to dynamic blockages. This article addresses these problems by developing an optimal beam association framework for mmWave vehicular networks under high mobility. Specifically, we use a semi-Markov decision process to capture the dynamics and uncertainty of the environment. The Q-learning algorithm is often used to find the optimal policy for such a process. However, Q-learning is notorious for its slow convergence. Instead of adopting deep reinforcement learning structures (like most works in the literature), we leverage the fact that there are usually multiple vehicles on the road to speed up the learning process. To that end, we develop a lightweight yet very effective parallel Q-learning algorithm that quickly obtains the optimal policy by learning simultaneously from multiple vehicles. Extensive simulations demonstrate that our proposed solution can increase the data rate by 47% and reduce the disconnection probability by 29% compared to other solutions.
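The idea of learning from several vehicles at once can be illustrated with a minimal sketch in which every vehicle writes its experience into one shared Q-table. This is not the article's algorithm: it assumes a toy road split into zones, a fixed table of nominal rates per zone and beam, a per-beam blockage probability, and the standard discounted Q-learning update rather than a semi-Markov formulation. All names (`parallel_q_learning`, `greedy_policy`, `rate`, `block_prob`) are hypothetical.

```python
import random


def parallel_q_learning(num_vehicles, num_zones, num_beams, rate, block_prob,
                        steps=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning in which all vehicles update one shared Q-table.

    Toy assumptions (not from the article): states are road zones, actions
    are beams, and a vehicle always moves to the next zone after each step.
    """
    rng = random.Random(seed)
    q = [[0.0] * num_beams for _ in range(num_zones)]
    # Each vehicle starts in a random zone.
    zones = [rng.randrange(num_zones) for _ in range(num_vehicles)]

    for _ in range(steps):
        for v in range(num_vehicles):
            s = zones[v]
            # Epsilon-greedy beam choice read from the shared table.
            if rng.random() < epsilon:
                a = rng.randrange(num_beams)
            else:
                a = max(range(num_beams), key=lambda b: q[s][b])
            # Toy reward: the nominal rate, or 0 if the beam is blocked.
            blocked = rng.random() < block_prob[s][a]
            r = 0.0 if blocked else rate[s][a]
            s_next = (s + 1) % num_zones  # the vehicle drives to the next zone
            # Standard Q-learning update, applied to the shared table.
            td_target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (td_target - q[s][a])
            zones[v] = s_next
    return q


def greedy_policy(q):
    """Return the index of the highest-valued beam for each zone."""
    return [max(range(len(row)), key=lambda b: row[b]) for row in q]
```

Because each pass of the outer loop applies one update per vehicle, the shared table receives `num_vehicles` times as many samples per time step as a single learner would, which is the intuition behind the speed-up described above. Calling `greedy_policy(parallel_q_learning(4, 3, 2, rate, block_prob))` with hypothetical 3-by-2 tables returns one beam index per zone.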
