Building segmentation is a fundamental task in Earth observation and aerial-imagery analysis. Most existing deep-learning-based algorithms in the literature are designed for imagery at a fixed or narrow range of spatial resolutions. In practical scenarios, users deal with a wide spectrum of image resolutions and thus often need to resample a given aerial image to match the spatial resolution of the dataset used to train the deep learning model. This, however, severely degrades the quality of the output segmentation masks. To address this issue, we propose in this research a scale-invariant neural network (Sci-Net) that is able to segment buildings in aerial images across different spatial resolutions. Specifically, we modified the U-Net architecture and fused it with dense Atrous Spatial Pyramid Pooling (ASPP) to extract fine-grained multi-scale representations. We compared the performance of our proposed model against several state-of-the-art models on the Open Cities AI dataset, and showed that Sci-Net provides a consistent improvement in performance across all resolutions available in the dataset.
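To make the dense ASPP component concrete, the following is a minimal PyTorch sketch of a densely connected atrous-convolution block of the kind the abstract describes: each branch receives the concatenation of the input and all previous branch outputs, so successive dilation rates build fine-grained multi-scale receptive fields. The class name, channel sizes, and dilation rates here are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class DenseASPP(nn.Module):
    """Atrous convolutions with dense connections: each branch sees the
    concatenation of the input and all previous branch outputs."""
    def __init__(self, in_channels, branch_channels=64, rates=(3, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList()
        channels = in_channels
        for r in rates:  # dilation rates are illustrative, not from the paper
            self.branches.append(nn.Sequential(
                nn.Conv2d(channels, branch_channels, kernel_size=3,
                          padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(branch_channels),
                nn.ReLU(inplace=True),
            ))
            channels += branch_channels  # dense growth of input channels
        # project the concatenated multi-scale features back to the input width
        self.project = nn.Conv2d(channels, in_channels, kernel_size=1)

    def forward(self, x):
        features = [x]
        for branch in self.branches:
            features.append(branch(torch.cat(features, dim=1)))
        return self.project(torch.cat(features, dim=1))

# Usage sketch: such a block could sit between a U-Net encoder stage and
# its decoder; the feature-map size below is hypothetical.
x = torch.randn(1, 256, 64, 64)
print(DenseASPP(256)(x).shape)  # torch.Size([1, 256, 64, 64])
```

Because the dilated branches preserve spatial dimensions, the block can be fused into a U-Net-style encoder-decoder without altering the skip-connection shapes.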