Radio access network (RAN) slicing is an important pillar in cross-domain network slicing, which covers RAN, edge, transport, and core slicing. The evolving network architecture requires the orchestration of multiple network resources such as radio and cache resources. In recent years, machine learning (ML) techniques have been widely applied for network management. However, most existing works do not take advantage of the knowledge transfer capability in ML. In this paper, we propose a deep transfer reinforcement learning (DTRL) scheme for joint radio and cache resource allocation to serve 5G RAN slicing. We first define a hierarchical architecture for the joint resource allocation. Then we propose two DTRL algorithms: Q-value-based deep transfer reinforcement learning (QDTRL) and action selection-based deep transfer reinforcement learning (ADTRL). In the proposed schemes, learner agents utilize expert agents' knowledge to improve their performance on target tasks. The proposed algorithms are compared with both the model-free exploration bonus deep Q-learning (EB-DQN) and the model-based priority proportional fairness and time-to-live (PPF-TTL) algorithms. Compared with EB-DQN, our proposed DTRL-based method achieves 21.4% lower delay for the Ultra Reliable Low Latency Communications (URLLC) slice and 22.4% higher throughput for the enhanced Mobile Broadband (eMBB) slice, while converging significantly faster than EB-DQN. Moreover, 40.8% lower URLLC delay and 59.8% higher eMBB throughput are observed with respect to PPF-TTL.
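The QDTRL idea summarized above, where a learner agent reuses an expert agent's Q-values on a related target task, can be illustrated with a minimal tabular sketch. This is not the paper's implementation: the names (`expert_q`, `transfer_weight`) and the decaying blend schedule are illustrative assumptions, shown here only to make the transfer mechanism concrete.

```python
import numpy as np

rng = np.random.default_rng(42)

N_STATES, N_ACTIONS = 4, 3

# Hypothetical expert Q-table, e.g. trained on a related source
# resource-allocation task (shape and values are illustrative).
expert_q = rng.normal(size=(N_STATES, N_ACTIONS))
# Learner's own Q-table for the target task, initially untrained.
learner_q = np.zeros((N_STATES, N_ACTIONS))

def transfer_weight(step, beta0=1.0, decay=0.05):
    # Decaying transfer weight: the learner leans on the expert's
    # knowledge early and on its own estimates later
    # (an assumed schedule, not the paper's exact one).
    return beta0 / (1.0 + decay * step)

def select_action(state, step, eps=0.1):
    # Q-value-based transfer: act greedily on a blend of learner and
    # expert Q-values, with epsilon-greedy exploration.
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))
    blended = learner_q[state] + transfer_weight(step) * expert_q[state]
    return int(np.argmax(blended))

action = select_action(state=0, step=0)
```

Early in training the blended values are dominated by the expert, which biases exploration toward actions that worked on the source task; as the weight decays, the policy converges to the learner's own Q-values, which is one common way transfer can speed up convergence without locking in the expert's policy.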