Long-range context information is crucial for the semantic segmentation of High-Resolution (HR) Remote Sensing Images (RSIs). Image cropping operations, commonly used when training neural networks, limit the perception of long-range context information in large RSIs. To break this limitation, we propose a Wider-Context Network (WiCNet) for the semantic segmentation of HR RSIs. In the WiCNet, apart from a conventional feature extraction network that aggregates local information, an extra context branch is designed to explicitly model the context information over a larger image area. The information between the two branches is communicated through a Context Transformer, a novel design derived from the Vision Transformer, to model long-range context correlations. Ablation studies and comparative experiments conducted on several benchmark datasets demonstrate the effectiveness of the proposed method. Additionally, we present a new Beijing Land-Use (BLU) dataset, a large-scale HR satellite dataset with high-quality, fine-grained reference labels, which we hope will boost future studies in this field.
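To make the dual-branch idea concrete, the following is a minimal, illustrative PyTorch sketch of a local branch operating on a cropped patch, a context branch operating on a downsampled wider view, and a cross-attention block in which local tokens query context tokens. All module names, channel sizes, strides, and the exact attention wiring are assumptions made for illustration; they do not reproduce the authors' WiCNet implementation.

```python
# Illustrative two-branch segmenter with a cross-attention "context" block.
# Assumed design, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContextTransformerBlock(nn.Module):
    """Cross-attention: tokens from the local crop attend to tokens from the wider view."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim))

    def forward(self, local_tokens, context_tokens):
        # Queries come from the cropped patch; keys/values come from the wider context.
        q = self.norm_q(local_tokens)
        kv = self.norm_kv(context_tokens)
        attended, _ = self.attn(q, kv, kv)
        x = local_tokens + attended
        return x + self.mlp(x)


class TwoBranchSegmenter(nn.Module):
    """Local branch on the crop, context branch on a downsampled wider area."""

    def __init__(self, dim: int = 64, num_classes: int = 6):
        super().__init__()
        # Hypothetical lightweight encoders standing in for the real backbones.
        self.local_branch = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=8, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.context_branch = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=16, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.context_transformer = ContextTransformerBlock(dim)
        self.head = nn.Conv2d(dim, num_classes, 1)

    def forward(self, crop, wide_view):
        f_local = self.local_branch(crop)        # (B, C, h, w)   features of the crop
        f_ctx = self.context_branch(wide_view)   # (B, C, h', w') features of the wider area
        b, c, h, w = f_local.shape
        tokens_local = f_local.flatten(2).transpose(1, 2)  # (B, h*w, C)
        tokens_ctx = f_ctx.flatten(2).transpose(1, 2)      # (B, h'*w', C)
        fused = self.context_transformer(tokens_local, tokens_ctx)
        fused = fused.transpose(1, 2).reshape(b, c, h, w)
        logits = self.head(fused)
        # Upsample back to the crop resolution for dense prediction.
        return F.interpolate(logits, size=crop.shape[-2:], mode="bilinear", align_corners=False)


# Usage: segment a 256x256 crop while attending to a 512x512 surrounding area.
model = TwoBranchSegmenter()
out = model(torch.randn(1, 3, 256, 256), torch.randn(1, 3, 512, 512))
print(out.shape)  # torch.Size([1, 6, 256, 256])
```

The key design point this sketch captures is that the segmentation output stays at the resolution of the local crop, while the attention step lets each local position aggregate evidence from a much larger surrounding area than the crop itself.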