Active target sensing is the task of discovering and classifying an unknown number of targets in an environment, and it is critical in search-and-rescue missions. This paper develops a deep reinforcement learning approach for planning informative trajectories that increase the likelihood that an uncrewed aerial vehicle (UAV) discovers missing targets. Our approach efficiently (1) explores the environment to discover new targets, (2) exploits its current belief over the target states and accounts for inaccurate sensor models to achieve high-fidelity classification, and (3) generates dynamically feasible trajectories for an agile UAV by employing a motion primitive library. Extensive simulations on randomly generated environments show that our approach discovers and classifies targets more efficiently than several baselines. A unique characteristic of our approach, in contrast to heuristic informative path planning approaches, is that it is robust to varying degrees of deviation of the prior belief from the true target distribution, thereby alleviating the challenge of designing heuristics specific to the application conditions.
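To make item (2) concrete, the following is a minimal sketch of how a class belief for a single target could be updated from a noisy classification. The abstract does not specify the belief representation or the sensor model, so everything here is an assumption made for illustration: the categorical belief, the confusion-matrix sensor model, and the hypothetical name `update_class_belief`. It is not the paper's implementation.

```python
import numpy as np


def update_class_belief(prior, confusion, observed):
    """Bayes update of a categorical class belief from one noisy label.

    Illustrative assumption, not the paper's model:
      prior      -- shape (K,), prior[k] = P(true class = k)
      confusion  -- shape (K, K), confusion[k, z] = P(sensor reports z | true class k)
      observed   -- index z of the label the sensor reported
    """
    prior = np.asarray(prior, dtype=float)
    # Likelihood of the reported label under each candidate true class.
    likelihood = np.asarray(confusion, dtype=float)[:, observed]
    unnormalized = likelihood * prior
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under the prior")
    return unnormalized / total


if __name__ == "__main__":
    # Hypothetical two-class sensor that is right 80% / 70% of the time.
    confusion = np.array([[0.8, 0.2],
                          [0.3, 0.7]])
    belief = np.array([0.5, 0.5])
    for z in [0, 0, 1]:  # a made-up sequence of sensor reports
        belief = update_class_belief(belief, confusion, z)
    print(belief)
```

Repeated observations sharpen the belief even though each individual reading is unreliable, which is one way a planner could trade off revisiting a known target against exploring for new ones.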