High-resolution satellite images provide abundant, detailed spatial information for land cover classification, which is particularly important for studying the complicated built environment. However, because of complex land cover patterns, costly training sample collection, and severe distribution shifts in satellite imagery, few studies have applied high-resolution images to large-scale land cover mapping with detailed categories. To fill this gap, we present a large-scale land cover dataset, Five-Billion-Pixels. It contains more than 5 billion labeled pixels from 150 high-resolution Gaofen-2 (4 m) satellite images, annotated in a 24-category system covering artificially constructed, agricultural, and natural classes. In addition, we propose a deep-learning-based unsupervised domain adaptation approach that transfers classification models trained on a labeled dataset (the source domain) to unlabeled data (the target domain) for large-scale land cover mapping. Specifically, we introduce an end-to-end Siamese network that employs dynamic pseudo-label assignment and a class-balancing strategy to perform adaptive domain joint learning. To validate the generalizability of our dataset and the proposed approach across different sensors and geographical regions, we carry out land cover mapping on five megacities in China and six cities in five other Asian countries, separately using PlanetScope (3 m), Gaofen-1 (8 m), and Sentinel-2 (10 m) satellite images. Over a total study area of 60,000 square kilometers, the experiments show promising results even though the input images are entirely unlabeled. The proposed approach, trained with the Five-Billion-Pixels dataset, enables high-quality, detailed land cover mapping across the whole of China and some other Asian countries at meter resolution.
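The abstract names dynamic pseudo-label assignment and class balancing but does not spell out how they work. As a rough illustration only, the sketch below shows one common way such a step can be set up: each target-domain pixel takes its most probable class as a pseudo-label, and a separate confidence cutoff per class keeps rare classes from being crowded out by a single global threshold. The function names, the `keep_fraction` parameter, the `IGNORE` value, and the linear schedule are all assumptions made for this example and are not taken from the paper.

```python
import math

IGNORE = -1  # assumed marker for pixels left out of the target-domain loss


def class_balanced_pseudo_labels(probs, keep_fraction):
    """Assign pseudo-labels, keeping the most confident pixels of each class.

    probs: list of per-pixel class-probability lists.
    keep_fraction: share of each predicted class's pixels to keep, in (0, 1].
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")

    # Predicted class and its confidence for every pixel.
    preds = [max(range(len(p)), key=p.__getitem__) for p in probs]
    confs = [p[c] for p, c in zip(probs, preds)]

    # Group confidences by predicted class.
    by_class = {}
    for c, conf in zip(preds, confs):
        by_class.setdefault(c, []).append(conf)

    # Per-class threshold: the confidence of the k-th most confident pixel.
    thresholds = {}
    for c, values in by_class.items():
        k = max(1, math.ceil(keep_fraction * len(values)))
        thresholds[c] = sorted(values, reverse=True)[k - 1]

    # Pixels below their class threshold are ignored rather than labeled.
    return [c if conf >= thresholds[c] else IGNORE
            for c, conf in zip(preds, confs)]


def keep_fraction_schedule(epoch, total_epochs, start=0.2, end=0.8):
    """Linearly raise the kept share over training (hypothetical schedule)."""
    t = epoch / max(1, total_epochs - 1)
    return start + (end - start) * min(1.0, t)


# Toy example: six pixels, two classes, keep the top half of each class.
toy_probs = [
    [0.90, 0.10],
    [0.60, 0.40],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.30, 0.70],
    [0.45, 0.55],
]
print(class_balanced_pseudo_labels(toy_probs, 0.5))  # [0, -1, -1, 0, 1, -1]
```

In the toy example, class 1 has only two candidate pixels, yet it still keeps its most confident one. With a single global cutoff of 0.8, that class would receive no pseudo-labels at all. Raising `keep_fraction` over the epochs is one simple way to make the assignment dynamic: early training uses only the most reliable pixels, and later training admits more.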