We study the problem of routing and scheduling real-time flows over a multi-hop millimeter wave (mmWave) mesh. We develop a model-free deep reinforcement learning algorithm that determines which subset of the mmWave links should be activated during each time slot and at what power level. The proposed algorithm, called Adaptive Activator RL (AARL), can handle a variety of network topologies, network loads, and interference models, and can adapt to different workloads. We demonstrate the operation of AARL on several topologies: a small topology with 10 links, a moderately sized mesh with 48 links, and a large topology with 96 links. For each topology, the results of AARL are compared to those of a greedy scheduling algorithm. AARL is shown to outperform the greedy algorithm in two respects. First, its schedules achieve higher goodput. Second, and even more importantly, while the run time of the greedy algorithm renders it impractical for real-time scheduling, the run time of AARL is suitable for meeting the time constraints of typical 5G networks.
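To make the per-slot decision concrete, the following is a minimal sketch of a generic greedy link scheduler of the kind often used as a baseline. It is not the greedy algorithm evaluated in this work, whose details are not given here. The function name `greedy_schedule`, the per-link `backlog` counts, and the symmetric pairwise `conflicts` map are all illustrative assumptions, and the sketch ignores power-level selection entirely.

```python
from typing import Dict, List, Set


def greedy_schedule(backlog: Dict[int, int],
                    conflicts: Dict[int, Set[int]]) -> List[int]:
    """Pick a conflict-free set of links for one time slot.

    backlog:   hypothetical number of queued packets per link id.
    conflicts: hypothetical symmetric map from a link id to the set of
               link ids that cannot be active in the same slot.
    """
    active: List[int] = []
    blocked: Set[int] = set()
    # Visit links by decreasing backlog; ties are broken by link id.
    for link in sorted(backlog, key=lambda l: (-backlog[l], l)):
        if backlog[link] <= 0 or link in blocked:
            continue  # nothing to send, or interferes with a chosen link
        active.append(link)
        blocked |= conflicts.get(link, set())
    return active


# Four links in a chain, where each link conflicts with its neighbours.
chain_conflicts = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
chain_backlog = {0: 3, 1: 5, 2: 4, 3: 1}
print(greedy_schedule(chain_backlog, chain_conflicts))
```

In this example link 1 is chosen first because it has the largest backlog, which blocks links 0 and 2, and link 3 is then added. A learned policy such as AARL would replace this fixed rule with a decision that also accounts for power levels and flow deadlines.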