Semantic segmentation for extracting buildings and roads from unmanned aerial vehicle (UAV) remote sensing images by deep learning has become a more efficient and convenient method than traditional manual segmentation in the surveying and mapping field. To make the model lightweight and improve its accuracy, a Lightweight Network Using OC-Transformer (LOCT) for extracting buildings and roads from UAV aerial remote sensing images is proposed. The proposed network adopts an encoder-decoder architecture in which a Lightweight Densely Connected Network (LDCNet) is developed as the encoder. In the decoder, dual multi-scale context modules, consisting of the Atrous Spatial Pyramid Pooling (ASPP) module and the Object Contextual Transformer (OC-Transformer) module, are designed to capture more context information from the feature maps of UAV remote sensing images. Between ASPP and OC-Transformer, a Feature Pyramid Network (FPN) module is used to fuse the multi-scale features extracted by ASPP. A private dataset of UAV remote sensing images is constructed, containing 2431 training images, 945 validation images, and 475 test images. The proposed model performs well on this dataset, with only 1.4M parameters and 5.48G floating-point operations (FLOPs), achieving a mean intersection-over-union (mIoU) of 71.12%. More extensive experiments on the public LoveDA and CITY-OSM datasets further verify the effectiveness of the proposed model, with excellent mIoU results of 65.27% and 74.39%, respectively. The source code will be made available at https://github.com/GtLinyer/LOCT .
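As a reading aid, the encoder-decoder pipeline described above (LDCNet encoder, then ASPP, FPN fusion, and OC-Transformer in the decoder, followed by a segmentation head) can be sketched as a shape-flow trace. This is a minimal illustrative sketch only: the stage names, strides, and channel counts below are assumptions for illustration, not values taken from the paper.

```python
# Hypothetical shape-flow sketch of the LOCT-style pipeline: encoder
# downsampling, multi-scale context at low resolution, then upsampling
# back to the input size. All strides/channels are illustrative guesses.

def conv_out(size, stride):
    """Spatial size after a stride-s convolution with 'same' padding."""
    return (size + stride - 1) // stride

def loct_shape_flow(h=512, w=512):
    stages = []
    # Assumed LDCNet encoder with an overall stride of 16 (4 stride-2 stages).
    for name, stride, ch in [("stem", 2, 32), ("stage1", 2, 64),
                             ("stage2", 2, 128), ("stage3", 2, 256)]:
        h, w = conv_out(h, stride), conv_out(w, stride)
        stages.append((name, ch, h, w))
    # ASPP: parallel atrous branches preserve spatial size, then project.
    stages.append(("aspp", 256, h, w))
    # FPN fusion and OC-Transformer also keep the 1/16 resolution here.
    stages.append(("fpn_fuse", 128, h, w))
    stages.append(("oc_transformer", 128, h, w))
    # Segmentation head upsamples x16 back to the input resolution;
    # 3 output classes assumed (background / building / road).
    stages.append(("head", 3, h * 16, w * 16))
    return stages

for name, ch, hh, ww in loct_shape_flow():
    print(f"{name:15s} {ch:4d} x {hh} x {ww}")
```

Under these assumptions, a 512x512 input reaches the context modules at 32x32 (1/16 resolution) and the head restores the full 512x512 prediction map.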