Remote sensing data is crucial for applications ranging from monitoring forest fires and deforestation to tracking urbanization. Most of these tasks require dense pixel-level annotations, yet only limited labeled data is available for satellite images, so a model must learn to parse visual information from very few labeled examples. Given this dearth of high-quality labeled training data, semi-supervised techniques deserve attention. These techniques learn from a small set of labeled examples and generate pseudo-labels for unlabeled images, which are then used to augment the labeled training set. Because the pseudo-labels depend on that small set, it must be highly representative and diverse. We therefore propose an active learning-based sampling strategy for selecting a highly representative set of labeled training data. We demonstrate the effectiveness of the proposed method on two existing semantic segmentation datasets containing satellite images: the UC Merced Land Use Classification Dataset and the DeepGlobe Land Cover Classification Dataset. With as little as 2% labeled data, selecting the labeled set with active learning sampling strategies yields a 27% improvement in mean Intersection over Union (mIoU) over selecting it at random.
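The abstract does not say which active learning criterion is used. As a minimal sketch of the general idea, assuming entropy-based uncertainty sampling (a common criterion, not necessarily the one in this work), the following picks a labeling budget from a pool of images and compares it with the random baseline. The function names, the input layout (one list of per-pixel class probabilities per image), and the toy data are all hypothetical.

```python
import math
import random


def mean_pixel_entropy(prob_map):
    """Average Shannon entropy (in nats) over the pixels of one image.

    prob_map: list of per-pixel class-probability lists, each summing to 1.
    This input layout is an assumption made for illustration.
    """
    total = 0.0
    for pixel in prob_map:
        total -= sum(p * math.log(p) for p in pixel if p > 0.0)
    return total / len(prob_map)


def select_by_uncertainty(prob_maps, fraction):
    """Return sorted indices of the most uncertain images within the budget."""
    budget = max(1, round(fraction * len(prob_maps)))
    scores = [mean_pixel_entropy(m) for m in prob_maps]
    ranked = sorted(range(len(prob_maps)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


def select_at_random(num_images, fraction, seed=0):
    """Baseline: draw the same budget uniformly at random."""
    budget = max(1, round(fraction * num_images))
    return sorted(random.Random(seed).sample(range(num_images), budget))


if __name__ == "__main__":
    # Toy pool: 100 images of 4 pixels and 3 classes; two images are ambiguous.
    pool = [[[0.98, 0.01, 0.01]] * 4 for _ in range(100)]
    pool[7] = [[1 / 3, 1 / 3, 1 / 3]] * 4
    pool[42] = [[1 / 3, 1 / 3, 1 / 3]] * 4
    print("uncertainty:", select_by_uncertainty(pool, 0.02))
    print("random:     ", select_at_random(len(pool), 0.02))
```

With a 2% budget on 100 images, both selectors return two indices. The uncertainty-based one returns the two ambiguous images, whereas the random baseline has no preference for them. Uncertainty alone does not guarantee the diversity the abstract calls for, so a practical criterion would likely combine it with a representativeness or diversity term.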