Radio access network (RAN) slicing is a key technology that enables 5G networks to support the heterogeneous requirements of generic services, namely ultra-reliable low-latency communication (URLLC) and enhanced mobile broadband (eMBB). In this paper, we propose a two-time-scale RAN slicing mechanism to optimize the performance of URLLC and eMBB services. On the large time scale, an SDN controller allocates radio resources to gNodeBs according to the requirements of the eMBB and URLLC services. On the short time scale, each gNodeB allocates its available resources to its end-users and, if needed, requests additional resources from adjacent gNodeBs. We formulate this problem as a non-linear binary program and prove its NP-hardness. Next, for each time scale, we model the problem as a Markov decision process (MDP): the large time scale is modeled as a single-agent MDP, whereas the short time scale is modeled as a multi-agent MDP. We leverage the exponential-weight algorithm for exploration and exploitation (EXP3) to solve the single-agent MDP of the large time scale and the multi-agent deep Q-learning (DQL) algorithm to solve the multi-agent MDP of the short-time-scale resource allocation. Extensive simulations show that our approach is efficient under different network parameter configurations and that it outperforms recent benchmark solutions.
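To make the large-time-scale step more concrete, the following is a minimal sketch of the textbook EXP3 update, not the exact formulation used in the paper. It assumes, purely for illustration, that each arm stands for one candidate allocation of radio resources to gNodeBs and that the controller observes a reward in [0, 1] after each decision; the class name `Exp3`, the parameter `gamma`, and the reward model are all hypothetical.

```python
import math
import random


class Exp3:
    """Textbook EXP3 bandit; each arm is a hypothetical candidate allocation."""

    def __init__(self, n_arms, gamma=0.1, seed=None):
        if not 0.0 < gamma <= 1.0:
            raise ValueError("gamma must lie in (0, 1]")
        self.n_arms = n_arms
        self.gamma = gamma              # exploration rate (assumed value)
        self.weights = [1.0] * n_arms   # one exponential weight per arm
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def select(self):
        # Sample an arm and return it with the probability it was drawn with.
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        # Importance-weighted reward estimate for the arm that was played.
        estimate = reward / prob
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale by the largest weight to avoid overflow; the sampling
        # probabilities are unchanged by this normalisation.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]
```

A usage loop would call `select()` once per large-time-scale period, apply the chosen allocation, measure a reward normalised to [0, 1] (for example, a weighted mix of eMBB throughput and URLLC latency satisfaction, which is an assumption here), and pass it to `update()`. Because only the played arm's weight changes, the method needs no model of how the unchosen allocations would have performed.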