Path planning methods for autonomous unmanned aerial vehicles (UAVs) are typically designed for one specific type of mission. This work presents a method for autonomous UAV path planning based on deep reinforcement learning (DRL) that can be applied to a wide range of mission scenarios. Specifically, we compare coverage path planning (CPP), where the UAV's goal is to survey an area of interest, to data harvesting (DH), where the UAV collects data from distributed Internet of Things (IoT) sensor devices. By exploiting structured map information of the environment, we train double deep Q-networks (DDQNs) with identical architectures on both distinctly different mission scenarios to make movement decisions that balance the respective mission goal with navigation constraints. By introducing a novel approach that exploits a compressed global map of the environment combined with a cropped but uncompressed local map showing the vicinity of the UAV agent, we demonstrate that the proposed method can efficiently scale to large environments. We also extend previous results on generalizing control policies that require no retraining when scenario parameters change, and offer a detailed analysis of the effects of crucial map processing parameters on path planning performance.
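The combined global/local map observation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, map layout, crop size, and pooling factor are all assumptions, with the local map obtained by center-cropping around the agent and the global map by average-pooling the full environment map.

```python
import numpy as np

def build_observation(env_map, uav_pos, local_size=17, pool=5):
    """Hypothetical sketch of the dual-map input: an uncompressed local
    crop centered on the UAV plus a compressed (average-pooled) global map.

    env_map: (H, W, C) float array of map layers (e.g. obstacles, zones).
    uav_pos: (row, col) grid position of the agent.
    All sizes and names here are illustrative assumptions.
    """
    H, W, C = env_map.shape
    half = local_size // 2

    # Local map: pad the borders so the crop stays valid at the edges,
    # then take a local_size x local_size window centered on the UAV.
    padded = np.pad(env_map, ((half, half), (half, half), (0, 0)))
    r, c = uav_pos
    local_map = padded[r:r + local_size, c:c + local_size, :]

    # Global map: trim to a multiple of the pooling factor, then
    # average-pool pool x pool blocks to compress the full map.
    Hp, Wp = (H // pool) * pool, (W // pool) * pool
    blocks = env_map[:Hp, :Wp, :].reshape(
        Hp // pool, pool, Wp // pool, pool, C)
    global_map = blocks.mean(axis=(1, 3))

    return local_map, global_map
```

Feeding both tensors to the network keeps the input size roughly constant as the environment grows, since only the pooled global map scales with the full map extent.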