Efficient aerial data collection is important in many remote sensing applications. In large-scale monitoring scenarios, deploying a team of unmanned aerial vehicles (UAVs) offers improved spatial coverage and robustness against individual failures. However, a key challenge is cooperative path planning for the UAVs to efficiently achieve a joint mission goal. We propose a novel multi-agent informative path planning approach based on deep reinforcement learning for adaptive terrain monitoring scenarios using UAV teams. We introduce new network feature representations to effectively learn path planning in a 3D workspace. By leveraging a counterfactual baseline, our approach explicitly addresses credit assignment to learn cooperative behaviour. Our experimental evaluation shows improved planning performance over non-counterfactual variants: our approach maps regions of interest more quickly. Results on synthetic and real-world data show that our approach outperforms state-of-the-art non-learning-based methods, while being transferable to varying team sizes and communication constraints.
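To illustrate the credit-assignment idea behind a counterfactual baseline, the following is a minimal sketch, not the authors' implementation. It assumes a baseline in the style of counterfactual multi-agent policy gradients (COMA): an agent's advantage is the critic's value for the joint action actually taken, minus the policy-weighted average over that agent's alternative actions while the other agents' actions stay fixed. The function name, argument names, and numbers are hypothetical.

```python
def counterfactual_advantage(q_values, policy, taken_action):
    """Counterfactual advantage for one agent (illustrative assumption).

    q_values[a]: critic estimate of the joint value if this agent had
                 chosen action a, with all other agents' actions fixed.
    policy[a]:   this agent's probability of choosing action a.
    """
    # Baseline: expected value when only this agent's action is marginalised.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    # Advantage: how much better the taken action was than that baseline.
    return q_values[taken_action] - baseline


# Hypothetical values for a single agent with three candidate actions.
q = [1.0, 2.0, 4.0]
pi = [0.5, 0.25, 0.25]
adv = counterfactual_advantage(q, pi, taken_action=2)
print(adv)
```

Because the baseline depends only on the other agents' actions, the advantage averages to zero under the agent's own policy. Each agent is therefore rewarded only for its own contribution to the team outcome.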