Because labeled data are scarce, using supervised models pre-trained on ImageNet is a de facto standard in remote sensing scene classification. Recently, the availability of larger high-resolution remote sensing (HRRS) image datasets and progress in self-supervised learning have raised two questions: whether supervised ImageNet pre-training is still necessary for remote sensing scene classification, and whether supervised pre-training on HRRS image datasets or self-supervised pre-training on ImageNet would achieve better results on target remote sensing scene classification tasks. To answer these questions, in this paper we both train models from scratch and fine-tune supervised and self-supervised ImageNet models on several HRRS image datasets. We also evaluate the transferability of the learned representations to HRRS scene classification tasks and show that self-supervised pre-training outperforms supervised pre-training, while HRRS pre-training performs similarly to self-supervised pre-training or slightly worse. Finally, we propose combining an ImageNet pre-trained model with a second round of pre-training on in-domain HRRS images, i.e., domain-adaptive pre-training. The experimental results show that domain-adaptive pre-training yields models that achieve state-of-the-art results on HRRS scene classification benchmarks. The source code and pre-trained models are available at \url{https://github.com/risojevicv/RSSC-transfer}.
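The ordering of stages in domain-adaptive pre-training can be illustrated with a minimal, framework-agnostic sketch. It is not the authors' implementation: the names `Stage`, `domain_adaptive_schedule`, and `run_schedule`, the dataset labels, and the `train_fn` callback are all hypothetical, and the abstract does not specify whether either pre-training round is supervised or self-supervised. The only point shown is that each stage starts from the backbone produced by the previous one.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """One training round; names and fields are illustrative assumptions."""
    name: str
    dataset: str


def domain_adaptive_schedule(in_domain_dataset: str, target_dataset: str) -> List[Stage]:
    """Return the three rounds in order: ImageNet, in-domain HRRS, target task."""
    return [
        Stage("imagenet_pretraining", "ImageNet"),
        Stage("in_domain_pretraining", in_domain_dataset),
        Stage("target_fine_tuning", target_dataset),
    ]


def run_schedule(
    stages: List[Stage],
    train_fn: Callable[[Stage, Optional[Any]], Any],
    init_backbone: Optional[Any] = None,
) -> Tuple[Any, List[str]]:
    """Run the stages in order, passing each stage's backbone to the next.

    `train_fn` is a user-supplied callback (hypothetical here) that trains on
    `stage.dataset` starting from the given backbone and returns the updated one.
    """
    backbone = init_backbone
    completed: List[str] = []
    for stage in stages:
        backbone = train_fn(stage, backbone)
        completed.append(stage.name)
    return backbone, completed


if __name__ == "__main__":
    # Stand-in for real training: just record which datasets shaped the weights.
    def record_lineage(stage: Stage, backbone: Optional[List[str]]) -> List[str]:
        return (backbone or []) + [stage.dataset]

    stages = domain_adaptive_schedule("in-domain HRRS set", "target HRRS set")
    lineage, _ = run_schedule(stages, record_lineage)
    print(" -> ".join(lineage))
```

In a real setting, `train_fn` would wrap an actual training loop in the deep learning framework of choice, and the first stage would typically be replaced by loading published ImageNet weights rather than training them again.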