Artificial intelligence (AI)-driven zero-touch network slicing (NS) is a new paradigm enabling the automation of resource management and orchestration (MANO) in multi-tenant beyond 5G (B5G) networks. In this paper, we tackle the problem of cloud-RAN (C-RAN) joint slice admission control and resource allocation by first formulating it as a Markov decision process (MDP). We then invoke an advanced continuous deep reinforcement learning (DRL) method called twin delayed deep deterministic policy gradient (TD3) to solve it. To this end, we introduce a multi-objective approach that lets the central unit (CU) learn how to re-configure computing resources autonomously while minimizing latency, energy consumption and virtual network function (VNF) instantiation cost for each slice. Moreover, we build a complete 5G C-RAN network slicing environment using the OpenAI Gym toolkit which, thanks to its standardized interface, can be easily tested with different DRL schemes. Finally, we present extensive experimental results to showcase the gains of TD3 as well as the adopted multi-objective strategy in terms of achieved slice admission success rate, latency, energy saving and CPU utilization.
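To illustrate the standardized interface mentioned above, the following is a minimal, hypothetical sketch of what a Gym-style C-RAN slicing environment could look like. The class name `CRANSlicingEnv`, the state/action definitions, the reward weights, and the latency/energy/VNF-cost proxies are all illustrative assumptions, not the paper's actual environment; the classic Gym step/reset API is assumed.

```python
import numpy as np
import gym
from gym import spaces


class CRANSlicingEnv(gym.Env):
    """Toy C-RAN slicing environment exposing the standard Gym interface.

    The agent re-allocates a CPU budget across slices; the reward is a
    weighted sum penalizing latency, energy consumption and VNF
    instantiation cost (all models and weights here are illustrative).
    """

    def __init__(self, num_slices=3, cpu_capacity=100.0, seed=0):
        super().__init__()
        self.num_slices = num_slices
        self.cpu_capacity = cpu_capacity
        self.rng = np.random.default_rng(seed)
        # Action: requested fraction of the CPU budget for each slice.
        self.action_space = spaces.Box(
            low=0.0, high=1.0, shape=(num_slices,), dtype=np.float32)
        # Observation: normalized per-slice traffic demand.
        self.observation_space = spaces.Box(
            low=0.0, high=1.0, shape=(num_slices,), dtype=np.float32)
        self._demand = None

    def _sample_demand(self):
        return self.rng.uniform(0.1, 1.0, size=self.num_slices).astype(np.float32)

    def reset(self):
        self._demand = self._sample_demand()
        return self._demand

    def step(self, action):
        # Turn the requested fractions into actual CPU shares per slice.
        share = np.clip(action, 0.0, 1.0)
        share = share / (share.sum() + 1e-8) * self.cpu_capacity
        # Crude proxies: latency grows when demand exceeds allocation,
        # energy grows with allocated CPU, instantiation cost counts
        # slices whose allocation requires an active VNF instance.
        latency = float(np.maximum(self._demand * self.cpu_capacity - share, 0.0).sum())
        energy = 0.01 * float(share.sum())
        vnf_cost = float((share > 0.05 * self.cpu_capacity).sum())
        reward = -(1.0 * latency + 0.5 * energy + 0.2 * vnf_cost)
        # Draw new traffic demand for the next decision epoch.
        self._demand = self._sample_demand()
        done = False
        return self._demand, reward, done, {}
```

Because the environment follows the Gym interface, an off-the-shelf continuous-control agent could in principle be plugged in directly, e.g. `TD3("MlpPolicy", CRANSlicingEnv())` from Stable-Baselines3, though the paper's own agent and environment details may differ.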