Reducing Effects of Swath Gaps on Unsupervised Machine Learning Models for NASA MODIS Instruments

Due to the nature of their orbital paths, the NASA Terra and NASA Aqua satellites capture imagery containing swath gaps, which are areas of no data. A swath gap can overlap the region of interest (ROI) completely, often rendering the entire image unusable by Machine Learning (ML) models. The problem is exacerbated when the ROI occurs rarely (e.g. a hurricane) and, when it does occur, is partially covered by a swath gap. With annotated data as supervision, a model can learn to differentiate between the area of focus and the swath gap. However, annotation is expensive, and the vast majority of existing data is currently unannotated. Hence, we propose an augmentation technique that largely removes swath gaps so that convolutional neural networks (CNNs) can focus on the ROI, making data with swath gaps usable for training. We experiment on the UC Merced Land Use Dataset, where we add synthetic swath gaps as empty polygons (covering up to 20 percent of the image area) and then apply augmentation techniques to fill them. We compare the model trained with our augmentation techniques on the swath gap-filled data against the model trained on the original data without swath gaps, and observe markedly improved performance. Additionally, we perform a qualitative analysis using activation maps, which visualize how effectively our trained network ignores the swath gaps. We also evaluate our results against a human baseline and show that, in certain cases, the filled swath gaps look so realistic that even a human evaluator could not distinguish original satellite images from swath gap-filled images. Since this method is aimed at unlabeled data, it is widely generalizable and impactful for large-scale unannotated datasets from various space data domains.
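The experimental setup described above (inserting empty polygons into clean images, then filling them) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the function names `add_swath_gap` and `fill_swath_gap` are hypothetical, the gap is modeled as a single slanted band, and the fill simply uses the per-channel mean of the valid pixels. The abstract does not specify the actual polygon shapes or fill strategies, so this is a stand-in rather than the proposed method.

```python
import numpy as np


def add_swath_gap(image, max_fraction=0.2, rng=None):
    """Zero out a slanted band covering at most `max_fraction` of the image.

    Hypothetical helper: the band shape and sampling ranges are assumptions.
    Returns the gapped image and a boolean mask (True = no data).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # A band with constant horizontal width band_w covers at most band_w / w
    # of the image area, which keeps the gap within the requested fraction.
    band_w = int(rng.uniform(0.05, max_fraction) * w)
    slope = rng.uniform(-0.5, 0.5)          # horizontal shift per row
    start = rng.uniform(0, w - band_w)      # left edge of the band in row 0
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    left = start + slope * rows
    mask = (cols >= left) & (cols < left + band_w)
    gapped = image.copy()
    gapped[mask] = 0                        # "empty" pixels, as in a swath gap
    return gapped, mask


def fill_swath_gap(gapped, mask):
    """Fill masked pixels with the per-channel mean of the valid pixels.

    Placeholder fill, chosen only for simplicity; it is not the paper's technique.
    """
    filled = gapped.astype(np.float32)      # astype returns a copy
    filled[mask] = filled[~mask].mean(axis=0)
    return filled.astype(gapped.dtype)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-in for a satellite tile (H x W x 3, uint8).
    tile = rng.integers(1, 256, size=(64, 64, 3), dtype=np.uint8)
    gapped, mask = add_swath_gap(tile, rng=rng)
    filled = fill_swath_gap(gapped, mask)
    print(f"gap covers {mask.mean():.1%} of the tile")
```

Returning the mask alongside the gapped image keeps the fill step independent of how the gap was generated, so a different fill strategy can be swapped in without touching the gap synthesis.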
