Network slicing is a critical technique for 5G communications, covering radio access network (RAN), edge, transport, and core slicing. The evolving network architecture requires the orchestration of multiple network resources, such as radio and cache resources. In recent years, machine learning (ML) techniques have been widely applied to network management. However, most existing works do not take advantage of the knowledge transfer capability of ML. In this paper, we propose a deep transfer reinforcement learning (DTRL) scheme for joint radio and cache resource allocation to serve 5G RAN slicing. We first define a hierarchical architecture for joint resource allocation. Then we propose two DTRL algorithms: Q-value-based deep transfer reinforcement learning (QDTRL) and action selection-based deep transfer reinforcement learning (ADTRL). In the proposed schemes, learner agents exploit expert agents' knowledge to improve their performance on current tasks. The proposed algorithms are compared with both the model-free exploration bonus deep Q-learning (EB-DQN) and the model-based priority proportional fairness and time-to-live (PPF-TTL) algorithms. Compared with EB-DQN, our proposed DTRL-based method achieves 21.4% lower delay for the Ultra-Reliable Low-Latency Communications (URLLC) slice and 22.4% higher throughput for the enhanced Mobile Broadband (eMBB) slice, while converging significantly faster than EB-DQN. Moreover, 40.8% lower URLLC delay and 59.8% higher eMBB throughput are observed with respect to PPF-TTL.
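To illustrate the core idea behind Q-value-based transfer, the following is a minimal toy sketch, not the paper's actual QDTRL algorithm: a tabular learner on a hypothetical chain MDP augments its standard TD update with a term that pulls its Q-values toward a pre-trained expert's Q-table, with the transfer weight `beta` decaying over training. All environment details, names, and constants here are illustrative assumptions.

```python
import random

N_STATES, ACTIONS, GAMMA, ALPHA = 5, (0, 1), 0.9, 0.5
TERMINAL = N_STATES - 1

def step(s, a):
    # toy deterministic chain: action 1 moves right, action 0 moves left;
    # entering the rightmost state yields reward 1 and ends the episode
    s2 = min(max(s + (1 if a == 1 else -1), 0), TERMINAL)
    return s2, (1.0 if s2 == TERMINAL else 0.0)

def value_iteration(sweeps=100):
    # expert Q-table computed exactly; stands in for a trained expert agent
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(TERMINAL):
            for a in ACTIONS:
                s2, r = step(s, a)
                v2 = 0.0 if s2 == TERMINAL else max(Q[(s2, b)] for b in ACTIONS)
                Q[(s, a)] = r + GAMMA * v2
    return Q

def train_learner(expert_Q, episodes=300, beta0=0.5):
    # Q-learning plus a Q-value transfer term (the QDTRL-style ingredient)
    random.seed(0)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for ep in range(episodes):
        beta = beta0 * (1 - ep / episodes)  # transfer weight decays to zero
        s = 0
        while s != TERMINAL:
            a = random.choice(ACTIONS) if random.random() < 0.1 else \
                max(ACTIONS, key=lambda b: Q[(s, b)])
            s2, r = step(s, a)
            v2 = 0.0 if s2 == TERMINAL else max(Q[(s2, b)] for b in ACTIONS)
            td = r + GAMMA * v2 - Q[(s, a)]
            # standard TD update plus a pull toward the expert's Q-value
            Q[(s, a)] += ALPHA * td + beta * (expert_Q[(s, a)] - Q[(s, a)])
            s = s2
    return Q
```

Because the transfer term dominates early and fades out, the learner follows the expert's value estimates while still converging to its own TD solution; action selection-based transfer (the ADTRL variant) would instead occasionally execute the expert's greedy action rather than regress on its Q-values.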