For semantic segmentation of remote sensing images (RSI), the trade-off between representation power and localization accuracy is quite important. How to achieve this trade-off effectively remains an open question, as current approaches that rely on attention schemes or very deep models result in complex networks with large memory consumption. Compared with the widely used convolutional neural network (CNN) with fixed square kernels, a graph convolutional network (GCN) can explicitly exploit correlations between adjacent land covers and perform flexible convolution on arbitrarily irregular image regions. However, the problems of large variation in target scale and blurred boundaries cannot be easily solved by the GCN alone, whereas a densely connected atrous convolution network (DenseAtrousCNet) with multi-scale atrous convolutions can expand the receptive field and capture global image information. Inspired by the complementary advantages of GCN and atrous CNN, this paper proposes a two-stream deep neural network for semantic segmentation of RSI (RSI-Net), which achieves improved performance by effectively modeling and propagating spatial contextual structure and by a novel decoding scheme that combines image-level and graph-level features. Extensive experiments are conducted on the Vaihingen, Potsdam and Gaofen RSI datasets, where the comparison results demonstrate the superior performance of RSI-Net in terms of overall accuracy, F1 score and kappa coefficient against six state-of-the-art RSI semantic segmentation methods.
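As a rough illustration of the two-stream idea described above (a graph-convolution stream plus a multi-scale atrous-convolution stream, fused at decoding time), the following PyTorch sketch is provided. It is not the paper's RSI-Net implementation: the layer widths, dilation rates, the pixel-to-node assignment `assign`, and fusion by channel concatenation are all assumptions made for the example.

```python
# Minimal sketch of a two-stream segmentation head combining an atrous (dilated)
# convolution branch over the image grid with a plain GCN branch over region nodes.
# All hyper-parameters and the node-assignment interface are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AtrousBranch(nn.Module):
    """Parallel atrous convolutions with different dilation rates (ASPP-style)."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        )
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        feats = [F.relu(b(x)) for b in self.branches]
        return F.relu(self.project(torch.cat(feats, dim=1)))


class GraphBranch(nn.Module):
    """One plain GCN layer: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, node_feats, adj):
        a_hat = adj + torch.eye(adj.size(0), device=adj.device)  # add self-loops
        d_inv_sqrt = a_hat.sum(dim=1).clamp(min=1e-6).pow(-0.5)
        norm_adj = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
        return F.relu(norm_adj @ self.weight(node_feats))


class TwoStreamSegHead(nn.Module):
    """Fuse image-level (atrous) and graph-level (GCN) features, then classify."""
    def __init__(self, in_ch, node_dim, hidden, num_classes):
        super().__init__()
        self.image_stream = AtrousBranch(in_ch, hidden)
        self.graph_stream = GraphBranch(node_dim, hidden)
        self.classifier = nn.Conv2d(hidden * 2, num_classes, 1)

    def forward(self, image_feats, node_feats, adj, assign):
        # image_feats: (B, C, H, W); node_feats: (N, node_dim); adj: (N, N)
        # assign: (B, N, H, W) soft assignment of pixels to graph nodes (assumed given)
        img = self.image_stream(image_feats)                      # (B, hidden, H, W)
        nodes = self.graph_stream(node_feats, adj)                # (N, hidden)
        graph_map = torch.einsum("bnhw,nc->bchw", assign, nodes)  # project nodes back to grid
        return self.classifier(torch.cat([img, graph_map], dim=1))
```

The graph branch here uses the standard symmetric-normalized adjacency; how RSI-Net actually builds the graph over land-cover regions and decodes the two streams is detailed in the paper's method section rather than in this sketch.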