Unmanned Aerial Vehicles (UAVs) are increasingly deployed to provide wireless connectivity to static and mobile ground users when network demand surges or when existing terrestrial cellular infrastructure suffers points of failure. However, UAVs are energy-constrained and may experience interference from nearby UAV cells sharing the same frequency spectrum, which degrades the system's energy efficiency (EE). Existing research optimises the system's EE through 2D trajectory optimisation of UAVs serving only static ground users, and neglects the impact of interference from nearby UAV cells; we aim to address these gaps. Unlike previous work that assumes global spatial knowledge of ground users' locations via a central controller that periodically scans the network perimeter and provides real-time updates to the UAVs for decision making, we focus on a realistic decentralised approach suitable for emergencies. Thus, we apply a decentralised Multi-Agent Reinforcement Learning (MARL) approach that maximises the system's EE by jointly optimising each UAV's 3D trajectory, number of connected static and mobile users, and energy consumed, while taking into account the impact of interference and the UAVs' coordination on the system's EE in a dynamic network environment. To this end, we propose a direct collaborative Communication-enabled Multi-Agent Decentralised Double Deep Q-Network (CMAD-DDQN) approach. The CMAD-DDQN is a collaborative algorithm that allows UAVs to explicitly share knowledge by communicating with their nearest neighbours based on existing 3GPP guidelines. Our approach is able to maximise the system's EE without degrading coverage performance in the network. Simulation results show that the proposed approach outperforms existing baselines in terms of maximising the system's EE by about 15% to 85%.
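For readers unfamiliar with the Double Deep Q-Network component named above, the following is a minimal sketch of the standard Double DQN bootstrap target that each agent would compute for one transition. It is not taken from the paper: the function name `double_dqn_target`, the discount factor, and the toy Q-values are hypothetical, and the abstract does not specify the reward, state, or action definitions used in CMAD-DDQN.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Standard Double DQN target for a single transition (illustrative only).

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    Both are plain sequences here; in practice they would be network outputs.
    """
    if done:
        # Terminal transition: there is no future value to bootstrap from.
        return reward
    # The online network selects the greedy next action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network evaluates it, which reduces the
    # overestimation bias of a single-network max.
    return reward + gamma * next_q_target[best_action]


# Hypothetical numbers: the online network prefers action 1, so the target
# network's value for action 1 (0.3) is the one that gets bootstrapped.
example = double_dqn_target(1.0, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.5)
```

In a decentralised multi-agent setting such as the one described, each UAV would presumably hold its own pair of online and target networks and apply this update to its local experience; how information received from neighbours enters the state is not stated in the abstract and is left out of the sketch.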