Many problems can be viewed as forms of geospatial search aided by aerial imagery, with examples ranging from detecting poaching activity to detecting human trafficking. We model this class of problems in a visual active search (VAS) framework, which takes as input an image of a broad area and aims to identify as many examples of a target object as possible. It does so through a limited sequence of queries, each of which verifies whether an example is present in a given region. We propose a reinforcement learning approach for VAS that learns a search policy from a collection of fully annotated search tasks used as training data, and that combines features of the input image with a natural representation of active search state. Additionally, we propose domain adaptation techniques to improve the policy at decision time when the training data is not fully reflective of the test-time distribution of VAS tasks. Through extensive experiments on several satellite imagery datasets, we show that the proposed approach significantly outperforms several strong baselines. Code and data will be made public.
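To make the query loop described above concrete, the following is a minimal sketch of a single VAS episode. It rests on several assumptions that are not stated in the abstract: the image is divided into a flat list of grid cells, each query costs one unit of budget, and the policy sees a per-cell score together with the outcomes of earlier queries. The names `run_episode` and `greedy_policy` are hypothetical, and the greedy rule is only a simple stand-in baseline, not the reinforcement learning policy proposed here.

```python
from typing import Callable, Dict, List

# A policy maps (per-cell scores, outcomes of past queries) to the next cell index.
Policy = Callable[[List[float], Dict[int, bool]], int]


def greedy_policy(scores: List[float], outcomes: Dict[int, bool]) -> int:
    """Stand-in baseline (assumption): pick the unqueried cell with the highest score."""
    candidates = [i for i in range(len(scores)) if i not in outcomes]
    return max(candidates, key=lambda i: scores[i])


def run_episode(labels: List[int], scores: List[float],
                budget: int, policy: Policy) -> int:
    """Run one search episode and return the number of targets found.

    labels[i] is 1 if cell i contains a target, else 0 (hidden from the policy).
    scores[i] is a hypothetical image-derived score for cell i.
    budget is the maximum number of queries (unit cost per query is assumed).
    """
    outcomes: Dict[int, bool] = {}  # search state: cell index -> query result
    found = 0
    for _ in range(min(budget, len(labels))):
        cell = policy(scores, outcomes)
        if cell in outcomes:
            raise ValueError(f"cell {cell} was already queried")
        hit = labels[cell] == 1      # the query verifies whether a target is present
        outcomes[cell] = hit         # the result becomes part of the search state
        found += int(hit)
    return found


if __name__ == "__main__":
    labels = [0, 1, 0, 1, 1, 0]
    scores = [0.1, 0.9, 0.2, 0.3, 0.8, 0.05]
    print(run_episode(labels, scores, budget=3, policy=greedy_policy))
```

A learned policy would replace `greedy_policy` and could use the `outcomes` dictionary to adjust its choices as the search proceeds, which the greedy rule ignores beyond skipping cells already queried.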