Long-range context information is crucial for the semantic segmentation of High-Resolution (HR) Remote Sensing Images (RSIs). The image cropping operations commonly used when training neural networks limit the perception of long-range context in large RSIs. To overcome this limitation, we propose a Wide-Context Network (WiCoNet) for the semantic segmentation of HR RSIs. In addition to a conventional feature extraction network that aggregates local information, WiCoNet includes an extra context branch designed to explicitly model the spatial information in a larger image area. The two branches exchange information through a Context Transformer, a novel design derived from the Vision Transformer that models long-range context correlations. Ablation studies and comparative experiments conducted on several benchmark datasets demonstrate the effectiveness of the proposed method. In addition, we present a new Beijing Land-Use (BLU) dataset, a large-scale HR satellite dataset with high-quality, fine-grained reference labels that can support future studies in this field.
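To make the two-branch idea concrete, the following is a minimal sketch in plain Python, not the WiCoNet implementation. It pairs a local crop with a larger context window, then lets local tokens attend to context tokens. The function names (`paired_windows`, `cross_attend`), the centring of the context window on the local crop, and the use of projection-free single-head cross-attention with a residual connection are all assumptions made for illustration; the abstract does not specify these details.

```python
import math


def paired_windows(row, col, local_size, context_size, image_h, image_w):
    """Return (local_box, context_box), each as (top, left, bottom, right).

    Assumption: the context window is centred on the local crop and shifted
    where necessary so that it stays inside the image.
    Requires context_size <= image_h and context_size <= image_w.
    """
    local_box = (row, col, row + local_size, col + local_size)
    margin = (context_size - local_size) // 2
    top = min(max(row - margin, 0), image_h - context_size)
    left = min(max(col - margin, 0), image_w - context_size)
    context_box = (top, left, top + context_size, left + context_size)
    return local_box, context_box


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(local_tokens, context_tokens):
    """Let every local token query all context tokens.

    Hypothetical simplification: no learned query/key/value projections,
    one head, and a residual connection back onto the local token.
    """
    dim = len(local_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in local_tokens:
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in context_tokens]
        weights = softmax(scores)
        attended = [sum(w * value[i] for w, value in zip(weights, context_tokens))
                    for i in range(dim)]
        fused.append([q + a for q, a in zip(query, attended)])
    return fused


if __name__ == "__main__":
    # Hypothetical sizes: a 256-pixel local crop inside a 1024-pixel context window.
    local_box, context_box = paired_windows(1000, 1000, 256, 1024, 2048, 2048)
    print(local_box, context_box)
    print(cross_attend([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

In this sketch the local crop always lies inside its context window, so features from the wider area can inform predictions for the crop. A trained model would replace the raw token lists with learned embeddings from each branch.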