Aerial robots are increasingly being utilized for a wide range of environmental monitoring and exploration tasks. However, a key challenge is efficiently planning paths to maximize the information value of acquired data as an initially unknown environment is explored. To address this, we propose a new approach for informative path planning (IPP) based on deep reinforcement learning (RL). Bridging the gap between recent advances in RL and robotic applications, our method combines Monte Carlo tree search with an offline-learned neural network predicting informative sensing actions. We introduce several components making our approach applicable for robotic tasks with continuous high-dimensional state spaces and large action spaces. By deploying the trained network during a mission, our method enables sample-efficient online replanning on physical platforms with limited computational resources. Evaluations using synthetic data show that our approach performs on par with existing information-gathering methods while reducing runtime by a factor of 8-10. We validate the performance of our framework using real-world surface temperature data from a crop field.
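The core idea above, Monte Carlo tree search guided by an offline-learned network that predicts informative sensing actions, can be illustrated with a minimal toy sketch. Everything here is hypothetical: the `prior` function is a uniform stand-in for the learned network, `information_gain` is a toy reward for visiting new grid cells, and the planner is a generic PUCT-style search, not the paper's actual implementation.

```python
import math

# Toy grid world: moving to an unvisited cell yields information.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def information_gain(state, visited):
    # Hypothetical reward: +1 for sensing a previously unvisited cell.
    return 0.0 if state in visited else 1.0

def prior(state, action):
    # Stand-in for the offline-learned network's action prior;
    # here simply uniform over the action set.
    return 1.0 / len(ACTIONS)

class Node:
    def __init__(self, state, visited):
        self.state = state
        self.visited = visited
        self.children = {}                    # action -> child Node
        self.N = {a: 0 for a in ACTIONS}      # visit counts
        self.Q = {a: 0.0 for a in ACTIONS}    # mean returns

def select_action(node, c=1.4):
    # PUCT-style selection: exploit Q, explore proportionally to the prior.
    total = 1 + sum(node.N.values())
    return max(ACTIONS, key=lambda a: node.Q[a]
               + c * prior(node.state, a) * math.sqrt(total) / (1 + node.N[a]))

def step(state, action):
    return (state[0] + action[0], state[1] + action[1])

def simulate(node, depth):
    # One tree-search rollout; returns the sampled return from this node.
    if depth == 0:
        return 0.0
    a = select_action(node)
    nxt = step(node.state, a)
    reward = information_gain(nxt, node.visited)
    if a not in node.children:
        # Leaf expansion; the full method would bootstrap with a
        # network value estimate here.
        node.children[a] = Node(nxt, node.visited | {nxt})
        value = reward
    else:
        value = reward + simulate(node.children[a], depth - 1)
    node.N[a] += 1
    node.Q[a] += (value - node.Q[a]) / node.N[a]
    return value

def plan(state, visited, iters=200, depth=5):
    # Online replanning step: run a budgeted search, act greedily
    # with respect to root visit counts.
    root = Node(state, visited)
    for _ in range(iters):
        simulate(root, depth)
    return max(ACTIONS, key=lambda a: root.N[a])
```

A single call such as `plan((0, 0), {(0, 0)})` returns the most-visited root action; in the paper's setting the search budget (`iters`, `depth`) is what keeps online replanning tractable on a compute-limited aerial platform.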