Contrastive learning methods have significantly narrowed the gap between supervised and unsupervised learning on computer vision tasks. In this paper, we explore their application to remote sensing, where unlabeled data is often abundant but labeled data is scarce. We first show that, due to the different characteristics of remote sensing data, a non-trivial gap persists between contrastive and supervised learning on standard benchmarks. To close the gap, we propose novel training methods that exploit the spatiotemporal structure of remote sensing data. We use spatially aligned images captured at different times to construct temporal positive pairs for contrastive learning, and we use geo-location to design pretext tasks. Our experiments show that the proposed method closes the gap between contrastive and supervised learning on image classification, object detection, and semantic segmentation for remote sensing and other geo-tagged image datasets.
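To make the idea of temporal positive pairs concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes an InfoNCE-style contrastive loss (the abstract does not name the loss), and the names `info_nce`, `sample_temporal_pair`, and the toy embeddings in `series` are hypothetical stand-ins for the outputs of an image encoder. The geo-location pretext task is not covered here.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(queries, keys, temperature=0.1):
    """Mean InfoNCE loss (assumed here): keys[i] is the positive for
    queries[i], and every other key in the batch acts as a negative."""
    q = [l2_normalize(v) for v in queries]
    k = [l2_normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [dot(qi, kj) / temperature for kj in k]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(q)


def sample_temporal_pair(location_series, rng):
    """Pick two items from the same location at two distinct time steps."""
    t1, t2 = rng.sample(range(len(location_series)), 2)
    return location_series[t1], location_series[t2]


# Hypothetical toy data: for each location, one embedding per acquisition time.
# In practice these would be encoder outputs for spatially aligned images.
series = {
    "loc_a": [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [1.0, 0.0, 0.0]],
    "loc_b": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0], [0.0, 1.0, 0.0]],
    "loc_c": [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9], [0.0, 0.0, 1.0]],
}

rng = random.Random(0)
pairs = [sample_temporal_pair(s, rng) for s in series.values()]
queries = [p[0] for p in pairs]
keys = [p[1] for p in pairs]

# Positives drawn from the same location at another time give a low loss;
# rotating the keys pairs each query with a different location instead.
aligned_loss = info_nce(queries, keys)
shuffled_loss = info_nce(queries, keys[1:] + keys[:1])
```

In this sketch the only change from a standard contrastive setup is where the positive comes from: instead of two augmentations of one image, the positive is a second image of the same location taken at a different time, while images of other locations in the batch serve as negatives.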