Constraint-Aware Deep Reinforcement Learning for End-to-End Resource Orchestration in Mobile Networks

Network slicing is a promising technology that allows mobile network operators to efficiently serve a variety of emerging 5G use cases. The challenge is to optimize the utilization of network infrastructure while guaranteeing that each network slice performs according to its service level agreement (SLA). To solve this problem, we propose SafeSlicing, which introduces a new constraint-aware deep reinforcement learning (CaDRL) algorithm that learns the optimal resource orchestration policy in two stages: offline training in a simulated environment, followed by online learning with the real network system. To optimize the resource orchestration, we incorporate the constraints on the statistical performance of slices into the reward function using Lagrangian multipliers, and solve the Lagrangian-relaxed problem via a policy network. To satisfy the constraints on system capacity, we design a constraint network that maps the latent actions generated by the policy network to orchestration actions such that the total resources allocated to network slices do not exceed the system capacity. We prototype SafeSlicing on an end-to-end testbed built with OpenAirInterface LTE, OpenDayLight-based SDN, and a CUDA GPU computing platform. The experimental results show that, compared with other solutions, SafeSlicing reduces resource usage by more than 20% while meeting the SLAs of the network slices.
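The two constraint-handling ideas in the abstract can be illustrated with a minimal Python sketch. It is not the SafeSlicing implementation. Every function name, the sigmoid-and-rescale mapping, and the dual-ascent step size are assumptions made for illustration. In particular, the paper's constraint network is learned, whereas `map_to_capacity` below is a fixed function standing in for it.

```python
import math

def lagrangian_reward(resource_usage, slice_costs, sla_thresholds, multipliers):
    """Reward of a Lagrangian-relaxed problem (illustrative form).

    Penalizes total resource usage plus each slice's SLA violation
    (cost minus threshold), weighted by that slice's multiplier.
    """
    penalty = sum(
        lam * (cost - limit)
        for lam, cost, limit in zip(multipliers, slice_costs, sla_thresholds)
    )
    return -resource_usage - penalty

def update_multipliers(multipliers, avg_costs, sla_thresholds, step_size=0.1):
    """Projected dual ascent (assumed update rule, not taken from the paper).

    A multiplier grows while its slice violates the SLA on average and
    shrinks otherwise; it is clipped so it never becomes negative.
    """
    return [
        max(0.0, lam + step_size * (cost - limit))
        for lam, cost, limit in zip(multipliers, avg_costs, sla_thresholds)
    ]

def map_to_capacity(latent_actions, capacity):
    """Hypothetical stand-in for the constraint network.

    Squashes each unconstrained latent action into (0, capacity) with a
    sigmoid, then rescales all allocations if their sum exceeds capacity.
    """
    demands = [capacity / (1.0 + math.exp(-z)) for z in latent_actions]
    total = sum(demands)
    scale = min(1.0, capacity / total) if total > 0 else 1.0
    return [d * scale for d in demands]

if __name__ == "__main__":
    allocations = map_to_capacity([2.0, 0.5, -1.0], capacity=100.0)
    print("allocations:", allocations, "total:", sum(allocations))
    print("reward:", lagrangian_reward(sum(allocations), [0.3, 0.05, 0.1],
                                       [0.1, 0.1, 0.1], [2.0, 0.5, 0.0]))
```

The point of the design is that the capacity limit holds by construction for every action, while the SLA constraints are only enforced on average through the multipliers.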

