The need to satisfy the QoS requirements of multiple network slices deployed at the same base station poses a major challenge to network operators. The problem becomes even harder when the desired QoS involves packet delays. In that case, network utility maximization is not directly applicable since the utilities of the slices are unknown. As a result, most related works learn online the utilities of all slices and how to split the resources among them. Unfortunately, this approach does not scale well to many slices. Instead, learning must be performed separately for each slice. To this end, we develop a bandwidth demand estimator: a network function that periodically receives as input the traffic of the slice and outputs the amount of bandwidth that its MAC scheduler needs to deliver the desired QoS. We design the bandwidth demand estimator for QoS targets involving packet-delay metrics using a model-based reinforcement learning algorithm. We implement the algorithm on a cellular testbed and conduct experiments with time-varying traffic loads. Results show that the algorithm delivers the desired QoS with significantly less bandwidth than non-adaptive approaches and other baseline online learning algorithms.
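The per-slice estimator interface described above can be sketched minimally as follows. This is a hypothetical illustration, not the paper's algorithm: it replaces the model-based reinforcement learning update with a simple multiplicative adjustment rule, and all class, parameter, and unit names are assumptions.

```python
class BandwidthDemandEstimator:
    """Hypothetical sketch of a per-slice bandwidth demand estimator.

    Periodically observes the slice's packet delay and outputs the bandwidth
    the MAC scheduler should be granted to meet the delay target. The update
    rule here is a simple multiplicative adjustment (an illustrative stand-in
    for the model-based RL algorithm in the paper).
    """

    def __init__(self, delay_target_ms: float, initial_bw_mbps: float = 10.0,
                 step: float = 0.1, min_bw_mbps: float = 1.0):
        self.delay_target_ms = delay_target_ms
        self.bw_mbps = initial_bw_mbps
        self.step = step
        self.min_bw_mbps = min_bw_mbps

    def update(self, observed_delay_ms: float) -> float:
        """Called once per estimation period with the measured packet delay."""
        if observed_delay_ms > self.delay_target_ms:
            # Delay target violated: request more bandwidth.
            self.bw_mbps *= 1 + self.step
        else:
            # Target met: shrink the allocation to save bandwidth.
            self.bw_mbps = max(self.min_bw_mbps, self.bw_mbps * (1 - self.step))
        return self.bw_mbps
```

Because each slice runs its own estimator instance, learning scales with the number of slices instead of requiring a joint allocation over all of them.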