Existing Earth Vision datasets are suitable for either semantic segmentation or object detection. In this work, we introduce the first benchmark dataset for instance segmentation in aerial imagery that combines instance-level object detection and pixel-level segmentation tasks. Compared to instance segmentation in natural scenes, aerial images present unique challenges, e.g., a huge number of instances per image, large object-scale variations, and abundant tiny objects. Our large-scale and densely annotated Instance Segmentation in Aerial Images Dataset (iSAID) comes with 655,451 object instances for 15 categories across 2,806 high-resolution images. Such precise per-pixel annotations for each instance ensure the accurate localization that is essential for detailed scene analysis. Compared to existing small-scale aerial-image-based instance segmentation datasets, iSAID contains 15$\times$ the number of object categories and 5$\times$ the number of instances. We benchmark our dataset using two popular instance segmentation approaches for natural images, namely Mask R-CNN and PANet. Our experiments show that directly applying off-the-shelf Mask R-CNN and PANet to aerial images yields suboptimal instance segmentation results, which calls for specialized solutions from the research community. The dataset is publicly available at: https://captain-whu.github.io/iSAID/index.html