Optimal resource allocation in modern communication networks requires optimizing objective functions that can be accessed only through costly, separate evaluations of each candidate solution. The conventional approach optimizes the resource-allocation parameters separately for each system configuration, characterized, e.g., by topology and traffic statistics, using global search methods such as Bayesian optimization (BO). These methods tend to require a large number of iterations, and hence a large number of key performance indicator (KPI) evaluations. In this paper, we propose the use of meta-learning to transfer knowledge from data collected from related, but distinct, configurations in order to speed up optimization on new network configurations. Specifically, we combine meta-learning with BO, as well as with multi-armed bandit (MAB) optimization, with the latter having the potential advantage of operating directly on a discrete search space. Furthermore, we introduce novel contextual meta-BO and meta-MAB algorithms, in which transfer of knowledge across configurations occurs at the level of a mapping from graph-based contextual information to resource-allocation parameters. Experiments on open-loop power control (OLPC) parameter optimization for the uplink of multi-cell multi-antenna systems provide insights into the potential benefits of meta-learning and contextual optimization.
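To make the idea of transferring knowledge across configurations concrete, the following is a minimal sketch of a bandit over a discrete set of candidate parameter values, warm-started from per-arm KPI averages observed on related configurations. It is not the meta-MAB algorithm of the paper: a simple pseudo-count prior stands in for meta-learning, and every name here (`meta_prior`, `ucb_with_prior`, `evaluate_kpi`, `prior_weight`) and every KPI value in the toy example is a hypothetical chosen for illustration.

```python
import math
import random


def meta_prior(past_tasks):
    """Average the per-arm mean KPI over previously seen configurations.

    past_tasks: list of lists; past_tasks[i][a] is the mean KPI of arm a
    on the i-th related configuration (hypothetical data format).
    """
    n_arms = len(past_tasks[0])
    return [sum(task[a] for task in past_tasks) / len(past_tasks)
            for a in range(n_arms)]


def ucb_with_prior(evaluate_kpi, n_arms, budget, prior_mean=None, prior_weight=0.0):
    """UCB-style bandit; each arm starts with `prior_weight` pseudo-observations
    whose value is `prior_mean[a]`. With prior_weight = 0 there is no transfer."""
    if prior_mean is None:
        prior_mean = [0.0] * n_arms
    counts = [float(prior_weight)] * n_arms
    sums = [prior_weight * m for m in prior_mean]
    history = []
    for _ in range(budget):
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            # Without a prior, every arm must be evaluated once first.
            arm = untried[0]
        else:
            total = sum(counts)
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(1.0 + total) / counts[a]),
            )
        reward = evaluate_kpi(arm)  # one costly KPI evaluation
        counts[arm] += 1.0
        sums[arm] += reward
        history.append(arm)
    means = [sums[a] / counts[a] if counts[a] > 0 else float("-inf")
             for a in range(n_arms)]
    best = max(range(n_arms), key=lambda a: means[a])
    return best, history


if __name__ == "__main__":
    rng = random.Random(0)
    true_kpi = [0.2, 0.4, 0.9]  # toy KPI per candidate value (made up)
    noisy = lambda a: true_kpi[a] + rng.gauss(0.0, 0.05)
    prior = meta_prior([[0.1, 0.5, 0.9], [0.3, 0.5, 0.7]])
    best, history = ucb_with_prior(noisy, 3, 20, prior_mean=prior, prior_weight=1.0)
    print("selected arm:", best, "first pulls:", history[:5])
```

With a nonzero `prior_weight`, the first evaluation already goes to the arm favored by the related configurations, which is the kind of speed-up in KPI evaluations that the abstract attributes to knowledge transfer. Setting `prior_weight=0.0` recovers a per-configuration bandit with no transfer.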