AirFed: Federated Graph-Enhanced Multi-Agent Reinforcement Learning for Multi-UAV Cooperative Mobile Edge Computing

Cooperative Mobile Edge Computing (MEC) systems built on multiple Unmanned Aerial Vehicles (UAVs) must coordinate trajectory planning, task offloading, and resource allocation while ensuring Quality of Service (QoS) in dynamic and uncertain environments. Existing approaches suffer from limited scalability, slow convergence, and inefficient knowledge sharing among UAVs, particularly in large-scale IoT device deployments with stringent deadline constraints. This paper proposes AirFed, a federated graph-enhanced multi-agent reinforcement learning framework that addresses these challenges through three key innovations. First, we design dual-layer dynamic Graph Attention Networks (GATs) that explicitly model spatio-temporal dependencies among UAVs and IoT devices, capturing both service relationships and collaborative interactions within the network topology. Second, we develop a dual-actor, single-critic architecture that jointly optimizes continuous trajectory control and discrete task offloading decisions. Third, we propose a reputation-based decentralized federated learning mechanism with gradient-sensitive adaptive quantization, enabling efficient and robust knowledge sharing across heterogeneous UAVs. Extensive experiments show that AirFed reduces weighted cost by 42.9% compared with state-of-the-art baselines, attains a deadline satisfaction rate above 99% and an IoT device coverage rate of 94.2%, and reduces communication overhead by 54.5%. Scalability analysis confirms robust performance across varying UAV numbers, IoT device densities, and system scales, supporting AirFed's practical applicability to large-scale UAV-MEC deployments.
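To make the third contribution more concrete, the following is a minimal sketch of how reputation-weighted aggregation might be combined with a norm-dependent quantization bit width. The abstract does not specify AirFed's actual rules, so everything here is an assumption for illustration only: the function names (`choose_bits`, `quantize`, `aggregate`), the two bit widths, the norm threshold, the uniform quantizer, and the use of normalized reputation scores as averaging weights are all hypothetical.

```python
import math


def choose_bits(update, low_bits=4, high_bits=8, norm_threshold=1.0):
    """Pick a bit width from the update's L2 norm.

    Hypothetical rule: updates with a larger norm are treated as more
    "sensitive" and get more bits. The threshold and widths are assumed.
    """
    norm = math.sqrt(sum(x * x for x in update))
    return high_bits if norm >= norm_threshold else low_bits


def quantize(update, bits):
    """Uniformly quantize values onto 2**bits levels spanning [min, max],
    then map them back to floats (simulating send and receive)."""
    lo, hi = min(update), max(update)
    if hi == lo:
        # A constant vector needs no quantization grid.
        return list(update)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + round((x - lo) / step) * step for x in update]


def aggregate(updates, reputations):
    """Average the quantized updates, weighting each UAV by its
    normalized reputation score (assumed aggregation rule)."""
    total = sum(reputations)
    if total <= 0:
        raise ValueError("reputations must sum to a positive value")
    result = [0.0] * len(updates[0])
    for rep, update in zip(reputations, updates):
        weight = rep / total
        quantized = quantize(update, choose_bits(update))
        for i, value in enumerate(quantized):
            result[i] += weight * value
    return result


if __name__ == "__main__":
    # Three hypothetical UAVs with different update magnitudes and reputations.
    uav_updates = [[0.02, -0.01, 0.03], [1.5, -0.7, 0.9], [0.4, 0.1, -0.2]]
    uav_reputations = [0.9, 0.6, 0.3]
    print(aggregate(uav_updates, uav_reputations))
```

In this sketch, a UAV with a low reputation contributes less to the shared model, and small updates are sent with fewer bits, which is one plausible way to trade accuracy against communication overhead.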
