LOANet: A Lightweight Network Using Object Attention for Extracting Buildings and Roads from UAV Aerial Remote Sensing Images

Deep-learning-based semantic segmentation for extracting buildings and roads from unmanned aerial vehicle (UAV) remote sensing images has become a more efficient and convenient method than traditional manual segmentation in the surveying and mapping field. To make the model lightweight while improving its accuracy, a Lightweight Network Using Object Attention (LOANet) for extracting buildings and roads from UAV aerial remote sensing images is proposed. The proposed network adopts an encoder-decoder architecture in which a Lightweight Densely Connected Network (LDCNet) is developed as the encoder. In the decoder, dual multi-scale context modules, consisting of the Atrous Spatial Pyramid Pooling (ASPP) module and the Object Attention Module (OAM), are designed to capture more context information from the feature maps of UAV remote sensing images. Between ASPP and OAM, a Feature Pyramid Network (FPN) module is used to extract and fuse the multi-scale features produced by ASPP. A private dataset of UAV remote sensing images is constructed, containing 2431 training, 945 validation, and 475 test images. On this dataset, the proposed model achieves a mean intersection-over-union (mIoU) of 71.12% with only 1.4M parameters and 5.48G floating-point operations (FLOPs). Further experiments on the public LoveDA and CITY-OSM datasets verify the effectiveness of the proposed model, with mIoU of 65.27% and 74.39%, respectively.
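The abstract reports every result as mIoU, so a small sketch of how that metric is commonly computed may help. This is a minimal pure-Python illustration using the standard per-class definition IoU = TP / (TP + FP + FN), averaged over classes; it is not the authors' evaluation code. The function names, the three-class labeling (0 = background, 1 = building, 2 = road), and the choice to skip classes absent from both prediction and ground truth are all assumptions made for the example.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Mean of per-class IoU = TP / (TP + FP + FN) over flattened label maps."""
    cm = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(num_classes)) - tp  # predicted c, truth differs
        fn = sum(cm[c]) - tp                                 # truth c, predicted otherwise
        denom = tp + fp + fn
        if denom == 0:
            # Assumption: a class absent from both maps is left out of the mean.
            continue
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Hypothetical flattened label maps: 0 = background, 1 = building, 2 = road.
    truth = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    print(f"mIoU = {mean_iou(truth, pred, num_classes=3):.4f}")
```

In the toy example the per-class IoUs are 1/3 (background), 2/3 (building), and 1/2 (road). In practice the confusion matrix is usually accumulated over the whole test set before the per-class IoUs are computed, rather than averaging per-image scores.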
