Aerial View Goal Localization with Reinforcement Learning

With the growing number and availability of unmanned aerial vehicles (UAVs) and other remote sensing devices (e.g. satellites), we have recently seen a vast increase in computer vision methods for aerial view data. One application of such technologies is search-and-rescue (SAR), where the task is to localize and assist one or several people who are missing, for example after a natural disaster. In many cases the rough location is known, and a UAV can be deployed to explore a given, confined area to precisely localize the missing people. Due to time and battery constraints, it is often critical that localization is performed as efficiently as possible. In this work, we approach this type of problem by abstracting it as an aerial view goal localization task in a framework that emulates a SAR-like setup without requiring access to actual UAVs. In this framework, an agent operates on top of an aerial image (a proxy for a search area) and is tasked with localizing a goal that is described in terms of visual cues. To further mimic the situation on an actual UAV, the agent cannot observe the search area in its entirety, not even at low resolution, and thus has to rely solely on partial glimpses when navigating towards the goal. To tackle this task, we propose AiRLoc, a reinforcement learning (RL)-based model that decouples exploration (searching for distant goals) and exploitation (localizing nearby goals). Extensive evaluations show that AiRLoc outperforms heuristic search methods as well as alternative learnable approaches. We also conduct a proof-of-concept study which indicates that the learnable methods outperform humans on average. Code is publicly available at https://github.com/aleksispi/airloc.
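
The task setup described above can be illustrated with a minimal sketch. It is not the AiRLoc implementation: the abstract does not specify the grid size, the action set, or how the goal is presented. The class name `GlimpseSearchEnv`, the 5x5 grid, the four-move action set, and the use of a goal patch as the "visual cue" are all assumptions made for illustration. The sketch only shows the partial-observability idea, in which the agent sees its current patch and a goal description but never the full image, and a random-walk search stands in for a heuristic baseline.

```python
import numpy as np


class GlimpseSearchEnv:
    """Toy search area: an image split into a grid of non-overlapping patches.

    Hypothetical illustration only; grid size and actions are assumptions.
    """

    # Assumed action set: move one patch along an image axis.
    MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

    def __init__(self, image, grid=5, seed=0):
        self.image = image
        self.grid = grid
        self.patch_h = image.shape[0] // grid
        self.patch_w = image.shape[1] // grid
        self.rng = np.random.default_rng(seed)
        self.agent = (0, 0)
        self.goal = (0, 0)

    def _patch(self, cell):
        # Crop the glimpse for one grid cell; this is all the agent ever sees.
        r, c = cell
        return self.image[
            r * self.patch_h:(r + 1) * self.patch_h,
            c * self.patch_w:(c + 1) * self.patch_w,
        ]

    def reset(self):
        # Sample distinct start and goal cells.
        cells = self.rng.choice(self.grid * self.grid, size=2, replace=False)
        self.agent = divmod(int(cells[0]), self.grid)
        self.goal = divmod(int(cells[1]), self.grid)
        # Observation: current glimpse plus the goal patch (the visual cue).
        return self._patch(self.agent), self._patch(self.goal)

    def step(self, action):
        # Moves that would leave the search area are clamped to the border.
        dr, dc = self.MOVES[action]
        r = min(max(self.agent[0] + dr, 0), self.grid - 1)
        c = min(max(self.agent[1] + dc, 0), self.grid - 1)
        self.agent = (r, c)
        done = self.agent == self.goal
        return self._patch(self.agent), done


def random_search(env, max_steps=10):
    """Random-walk baseline: returns steps taken to reach the goal, or None."""
    env.reset()
    actions = list(env.MOVES)
    for t in range(1, max_steps + 1):
        action = actions[int(env.rng.integers(len(actions)))]
        _, done = env.step(action)
        if done:
            return t
    return None
```

A learned policy would replace the random action choice with a decision based on the current glimpse, the goal patch, and the history of visited cells. The step budget `max_steps` plays the role of the time and battery constraint.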
