The widespread adoption of wireless communication technology has advanced smart agriculture, in which unmanned aerial vehicles (UAVs) play a multifunctional role. We consider a multi-UAV smart agriculture system in which UAVs cooperatively perform data collection, image acquisition, and communication tasks. We formulate the multi-UAV trajectory planning problem as a Markov decision process. We then propose a novel Elite Imitation Actor-Shared Ensemble Critic (EIA-SEC) framework, in which agents adaptively learn from the elite agent to reduce trial-and-error costs, and a shared ensemble critic works with each agent's local critic to ensure unbiased objective value estimates and prevent overestimation. Experimental results show that EIA-SEC outperforms state-of-the-art baselines in reward, training stability, and convergence speed.
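To make the two ideas named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the elite agent is the one with the highest recent return, that the imitation weight grows with an agent's return gap to the elite, and that the critic target takes the smaller of the local estimate and the shared ensemble mean, which is one common way to curb overestimation. All function names, the `scale` parameter, and the specific formulas are hypothetical and chosen for illustration only.

```python
import numpy as np


def select_elite(episode_returns):
    """Return the index of the agent with the highest recent return (assumed elite rule)."""
    return int(np.argmax(episode_returns))


def imitation_weight(own_return, elite_return, scale=1.0):
    """Hypothetical adaptive weight in [0, 1): larger when the agent lags the elite more."""
    gap = max(elite_return - own_return, 0.0)
    return float(1.0 - np.exp(-scale * gap))


def imitation_loss(own_action, elite_action):
    """Mean squared distance between an agent's action and the elite agent's action."""
    diff = np.asarray(own_action, dtype=float) - np.asarray(elite_action, dtype=float)
    return float(np.mean(diff ** 2))


def conservative_target(reward, gamma, local_q_next, ensemble_q_next, done=False):
    """TD target using the smaller of the local estimate and the ensemble mean (assumed rule)."""
    ensemble_mean = float(np.mean(ensemble_q_next))
    q_next = min(float(local_q_next), ensemble_mean)
    return float(reward + gamma * (1.0 - float(done)) * q_next)


if __name__ == "__main__":
    returns = [1.0, 3.0, 2.0]          # illustrative per-agent returns
    elite = select_elite(returns)
    for i, r in enumerate(returns):
        w = imitation_weight(r, returns[elite])
        print(f"agent {i}: imitation weight {w:.3f}")
    print(conservative_target(1.0, 0.9, 5.0, [2.0, 4.0]))
```

In this sketch an agent's actor loss would be its usual policy-gradient term plus `imitation_weight(...) * imitation_loss(...)`, so the elite itself receives zero imitation pressure; how EIA-SEC actually combines these terms is not specified in the abstract.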