Artificial intelligence (AI)-driven zero-touch network slicing is envisaged as a promising technology for harnessing the full potential of heterogeneous 5G and beyond-5G (B5G) communication systems and for automating demand-aware resource management and orchestration (MANO). In this paper, we tackle joint slice admission control and resource allocation in the B5G radio access network (RAN), under a proposed slice-enabling cell-free massive multiple-input multiple-output (mMIMO) setup, by invoking a continuous deep reinforcement learning (DRL) method. We present a novel actor-critic-based network slicing approach called prioritized twin delayed distributional deep deterministic policy gradient (D-TD3). The paper defines, and corroborates via extensive experimental results, a zero-touch network slicing scheme with a multi-objective approach in which the central server learns continuously, accumulating past knowledge to solve future problems and reconfigure computing resources autonomously while minimizing latency, energy consumption, and virtual network function (VNF) instantiation cost for each slice. Moreover, we pursue a state-action return distribution learning approach combined with the proposed replay policy and reward-penalty mechanisms. Finally, we present numerical results that showcase the gain of the adopted multi-objective strategy and verify its performance in terms of achieved slice admission rate, latency, energy, CPU utilization, and time efficiency.
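To make the multi-objective, reward-penalty idea concrete, the following is a minimal Python sketch of how a per-slice reward could combine latency, energy, and VNF instantiation cost with penalties. It is an illustration under stated assumptions, not the reward used in the paper: the `SliceOutcome` fields, the weights, the latency budget, and both penalty values are hypothetical names and numbers introduced here.

```python
from dataclasses import dataclass


@dataclass
class SliceOutcome:
    """Per-slice measurements after one allocation step (hypothetical fields)."""
    admitted: bool
    latency_ms: float
    energy_j: float
    vnf_cost: float
    latency_budget_ms: float  # assumed per-slice latency bound


def slice_reward(outcome, w_latency=1.0, w_energy=1.0, w_cost=1.0,
                 rejection_penalty=1.0, violation_penalty=2.0):
    """Scalarize the three objectives into one reward (assumed weighted sum)."""
    if not outcome.admitted:
        # Penalty branch: a rejected slice request only costs the agent.
        return -rejection_penalty
    # Weighted sum of the quantities to minimize; latency is normalized
    # by the slice's budget so that 1.0 means "exactly at the bound".
    cost = (w_latency * outcome.latency_ms / outcome.latency_budget_ms
            + w_energy * outcome.energy_j
            + w_cost * outcome.vnf_cost)
    reward = -cost
    if outcome.latency_ms > outcome.latency_budget_ms:
        # Extra penalty when the admitted slice misses its latency bound.
        reward -= violation_penalty
    return reward


def step_reward(outcomes, **weights):
    """Total reward for one decision step: the sum over all slices."""
    return sum(slice_reward(o, **weights) for o in outcomes)
```

In a DRL loop, a scalar like `step_reward` would be the signal fed back to the actor-critic agent after each admission and allocation decision; changing the weights shifts the trade-off between the three objectives.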