Infrared small target detection is one of the most difficult problems in computer vision because of the complicated backgrounds and noise of infrared images. Most existing studies treat the task as semantic segmentation and take the centroid of each target in the segmentation map as the detection result. In contrast, this paper proposes a novel end-to-end framework for infrared small target detection and segmentation. First, by using UNet as the backbone to preserve resolution and semantic information and attaching a simple anchor-free head, our model achieves higher detection accuracy than other state-of-the-art methods. Second, a pyramid pooling module further extracts features and improves the precision of target segmentation. Third, we use the semantic segmentation task, which pays more attention to pixel-level features, to assist the training of object detection; this increases the average precision and allows the model to detect some targets that were previously undetectable. Furthermore, we develop a multi-task framework for infrared small target detection and segmentation. Compared with the composite single-task model, our multi-task learning model reduces complexity by nearly half and nearly doubles inference speed while maintaining accuracy. The code and models are publicly available at https://github.com/Chenastron/MTUNet.
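As an illustration of the segmentation-based baseline the abstract contrasts against, in which each target's centroid is computed from the segmentation map, the following is a minimal sketch only. It is not taken from the MTUNet repository: the function name `target_centroids`, the toy mask, and the choice of 8-connectivity for grouping pixels into targets are all assumptions made for this example.

```python
from collections import deque


def target_centroids(mask):
    """Return one (row, col) centroid per 8-connected foreground region.

    `mask` is a binary segmentation map given as a list of rows.
    Hypothetical helper for illustration; 8-connectivity is an assumption.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    centroids = []

    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected target region.
            queue = deque([(r, c)])
            seen[r][c] = True
            sum_r = sum_c = count = 0
            while queue:
                y, x = queue.popleft()
                sum_r += y
                sum_c += x
                count += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Centroid is the mean pixel coordinate of the region.
            centroids.append((sum_r / count, sum_c / count))
    return centroids


# Toy segmentation map with two small targets.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1],
]
print(target_centroids(mask))  # [(1.5, 1.5), (3.5, 5.0)]
```

A post-processing step of this kind depends entirely on the quality of the segmentation map, which is one reason a dedicated detection head can be attractive.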