Network slicing is a promising technology that allows mobile network operators to efficiently serve various emerging use cases in 5G. It is challenging to optimize the utilization of network infrastructures while guaranteeing the performance of network slices according to service level agreements (SLAs). To solve this problem, we propose SafeSlicing, which introduces a new constraint-aware deep reinforcement learning (CaDRL) algorithm to learn the optimal resource orchestration policy in two steps, i.e., offline training in a simulated environment and online learning with the real network system. In optimizing the resource orchestration, we incorporate the constraints on the statistical performance of slices into the reward function using Lagrangian multipliers, and solve the Lagrangian relaxed problem via a policy network. To satisfy the constraints on the system capacity, we design a constraint network that maps the latent actions generated by the policy network to the orchestration actions such that the total resources allocated to network slices do not exceed the system capacity. We prototype SafeSlicing on an end-to-end testbed developed using OpenAirInterface LTE, an OpenDayLight-based SDN, and the CUDA GPU computing platform. The experimental results show that SafeSlicing reduces resource usage by more than 20% while meeting the SLAs of network slices, as compared with other solutions.
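To make the two constraint-handling mechanisms in the abstract concrete, the following is a minimal Python sketch of (i) a Lagrangian-relaxed reward that penalizes statistical SLA violations and (ii) a capacity-preserving mapping from latent policy outputs to orchestration actions. The function names, the softmax projection, and the numbers are illustrative assumptions for exposition; the paper's actual constraint network is a learned mapping, and the multipliers are updated during training rather than fixed.

```python
import numpy as np

def constrained_allocation(latent_action, capacity):
    """Illustrative stand-in for the constraint network: map unbounded latent
    actions to per-slice allocations whose sum does not exceed the capacity."""
    weights = np.exp(latent_action - latent_action.max())  # numerically stable softmax
    return capacity * weights / weights.sum()

def lagrangian_reward(utility, slice_violations, multipliers):
    """Lagrangian-relaxed reward: utility minus multiplier-weighted penalties
    for per-slice statistical SLA violations."""
    return utility - float(np.dot(multipliers, np.maximum(slice_violations, 0.0)))

# Hypothetical example: three slices sharing 100 resource units.
latent = np.array([0.5, -1.0, 2.0])              # raw output of the policy network
alloc = constrained_allocation(latent, capacity=100.0)
assert alloc.sum() <= 100.0 + 1e-6               # system-capacity constraint holds

reward = lagrangian_reward(
    utility=-alloc.sum(),                        # e.g., reward lower resource usage
    slice_violations=np.array([0.0, 0.2, 0.0]),  # measured SLA violation per slice
    multipliers=np.array([1.0, 1.0, 1.0]),       # Lagrangian multipliers (assumed fixed here)
)
```

In this sketch the capacity constraint is enforced by construction at every step, while the statistical SLA constraints are only discouraged through the relaxed reward and are satisfied asymptotically as the multipliers adapt.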