The application of deep neural networks to remote sensing imagery is often constrained by the lack of ground-truth annotations. Addressing this issue requires models that generalize efficiently from limited amounts of labeled data, allowing us to tackle a wider range of Earth observation tasks. Another challenge in this domain is developing algorithms that operate at variable spatial resolutions, e.g., for the problem of classifying land use at different scales. Recently, self-supervised learning has been applied in the remote sensing domain to exploit readily available unlabeled data, and has been shown to reduce or even close the gap with supervised learning. In this paper, we study self-supervised visual representation learning through the lens of label efficiency, for the task of land use classification on multi-resolution/multi-scale satellite images. We benchmark two contrastive self-supervised methods adapted from Momentum Contrast (MoCo) and provide evidence that these methods can perform effectively given little downstream supervision, where randomly initialized networks fail to generalize. Moreover, they outperform out-of-domain pretraining alternatives. We use the large-scale fMoW dataset to pretrain and evaluate the networks, and validate our observations with transfer to the RESISC45 dataset.