UAV Trajectory, User Association and Power Control for Multi-UAV Enabled Energy Harvesting Communications: Offline Design and Online Reinforcement Learning

July 21, 2022

Chien-Wei Fu, Meng-Lin Ku, Yu-Jia Chen, Tony Q. S. Quek

In this paper, we consider multiple solar-powered wireless nodes that use harvested solar energy to transmit collected data to multiple unmanned aerial vehicles (UAVs) in the uplink. In this context, we jointly design the UAV flight trajectories, UAV-node communication associations, and uplink power control to utilize the harvested energy effectively and manage co-channel interference within a finite time horizon. To ensure fairness among the wireless nodes, the design goal is to maximize the worst user rate. The joint design problem is highly non-convex and requires non-causal (future) knowledge of the instantaneous energy state information (ESI) and channel state information (CSI), which is difficult to predict in reality. To overcome these challenges, we propose an offline method based on convex optimization that uses only the average ESI and CSI. The problem is decomposed into three convex subproblems, which are solved with successive convex approximation (SCA) and alternating optimization. We further design an online convex-assisted reinforcement learning (CARL) method to improve the system performance based on real-time environmental information. We propose multi-UAV regulated flight corridors, built from the optimal offline UAV trajectories, which avoid unnecessary flight exploration by the UAVs and improve learning efficiency and system performance compared with the conventional reinforcement learning (RL) method. Computer simulations verify the effectiveness of the proposed methods. The proposed CARL method improves the worst user rate by 25% and 12% over the offline and conventional RL methods, respectively.
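To make the max-min objective concrete, the following is a minimal sketch of how a worst user rate under co-channel interference could be evaluated for a single time slot. It is not the authors' formulation: the free-space line-of-sight channel model, the reference gain `beta0`, the fixed UAV altitude, the noise power, and the function names `channel_gain` and `worst_user_rate` are all assumptions chosen for illustration.

```python
import math


def channel_gain(node_xy, uav_xy, altitude, beta0=1e-3):
    """Assumed free-space LoS power gain: beta0 / (squared 3D distance)."""
    dx = node_xy[0] - uav_xy[0]
    dy = node_xy[1] - uav_xy[1]
    return beta0 / (dx * dx + dy * dy + altitude * altitude)


def worst_user_rate(nodes, uavs, assoc, power, altitude=100.0, noise=1e-10):
    """Minimum uplink rate (bit/s/Hz) over all active nodes in one slot.

    assoc[k] is the index of the UAV serving node k, or None if node k
    is idle. Every other active node is treated as a co-channel
    interferer at the serving UAV (an illustrative assumption).
    """
    active = [k for k, m in enumerate(assoc) if m is not None]
    rates = []
    for k in active:
        m = assoc[k]
        signal = power[k] * channel_gain(nodes[k], uavs[m], altitude)
        interference = sum(
            power[j] * channel_gain(nodes[j], uavs[m], altitude)
            for j in active
            if j != k
        )
        rates.append(math.log2(1.0 + signal / (noise + interference)))
    # With no active node there is no rate to report; return 0.0 by convention.
    return min(rates) if rates else 0.0


# Hypothetical layout: two nodes, each with a UAV hovering directly above it.
nodes = [(0.0, 0.0), (100.0, 0.0)]
uavs = [(0.0, 0.0), (100.0, 0.0)]
rate = worst_user_rate(nodes, uavs, assoc=[0, 1], power=[0.1, 0.1])
```

In this sketch, lowering one node's transmit power reduces the interference it causes but also its own rate, which is the kind of coupling that a joint trajectory, association, and power control design has to balance.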
