Power lines pose a significant safety threat to unmanned aerial vehicles (UAVs) operating at low altitudes. Detecting them in aerial images is challenging, however, because the foreground (i.e., the power lines) occupies only a small part of each image while background information is abundant. To address this challenge, we propose DUFormer, a semantic segmentation algorithm designed specifically for power line detection in aerial images. We hypothesize that sufficient feature extraction with a convolutional neural network (CNN), which has a strong inductive bias, is beneficial for training an efficient Transformer model. To this end, we propose a heavy token encoder responsible for overlapping feature re-mining and tokenization. The encoder comprises a pyramid CNN feature extraction module and a power line feature enhancement module. Once power line features have been sufficiently extracted, they are fused, and a Transformer block then performs global modeling. The final segmentation result is obtained by fusing local and global features in the decode head. Additionally, we demonstrate the significance of a joint multi-weight loss function in power line segmentation. Experimental results show that the proposed method achieves state-of-the-art performance in power line segmentation on the publicly available TTPLA dataset.
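The abstract does not define the joint multi-weight loss, so the following is only an illustrative sketch of why weighting matters when the foreground is tiny. It assumes a combination of a foreground-weighted binary cross-entropy and a Dice term; the function names, the `pos_weight`, `w_bce`, and `w_dice` parameters, and their default values are all hypothetical and are not taken from DUFormer.

```python
import math

def weighted_bce(probs, labels, pos_weight=10.0, eps=1e-7):
    """Mean binary cross-entropy with foreground pixels up-weighted by pos_weight."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 minus the smoothed overlap ratio between prediction and mask."""
    inter = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)

def joint_loss(probs, labels, w_bce=1.0, w_dice=1.0, pos_weight=10.0):
    """Assumed joint loss: weighted sum of the two terms above (weights are illustrative)."""
    return (w_bce * weighted_bce(probs, labels, pos_weight)
            + w_dice * dice_loss(probs, labels))

# Toy flattened mask: 2 power line pixels out of 100.
labels = [0] * 98 + [1] * 2
all_background = [0.01] * 100              # predicts "no power line" everywhere
good_prediction = [0.01] * 98 + [0.99] * 2 # finds both foreground pixels

# Plain BCE barely penalizes missing the line; the weighted joint loss penalizes it more.
plain_bce_miss = weighted_bce(all_background, labels, pos_weight=1.0)
joint_miss = joint_loss(all_background, labels)
joint_good = joint_loss(good_prediction, labels)
```

On this toy mask, a prediction that ignores the power line entirely still gets a low unweighted cross-entropy because 98% of the pixels are background. Up-weighting the foreground and adding an overlap-based term makes the same miss considerably more costly, which is the general motivation for weighted losses in this setting.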