Improving the accuracy of fire detection using infrared night-vision cameras remains a challenging task. Previous studies have reported strong performance with popular detection models: YOLOv7 achieved an mAP50-95 of 0.51 at an input image size of 640 × 1280, RT-DETR reached an mAP50-95 of 0.65 at 640 × 640, and YOLOv9 obtained an mAP50-95 of 0.598 at the same resolution. Despite these results, limitations in dataset construction continue to cause problems, most notably the frequent misclassification of bright artificial lights as fire. This report presents three main contributions: an additional NIR dataset, a two-stage detection model, and Patched-YOLO. First, to address data scarcity, we explore and apply various data augmentation strategies for both the NIR dataset and the classification dataset. Second, to improve night-time fire detection accuracy while reducing false positives caused by artificial lights, we propose a two-stage pipeline that combines YOLOv11 for detection with EfficientNetV2-B0 for verification. The proposed approach achieves higher detection accuracy than previous methods, particularly for night-time fire detection. Third, to improve fire detection in RGB images, especially for small and distant objects, we introduce Patched-YOLO, which enhances the model's detection capability through patch-based processing. Further details of these contributions are discussed in the following sections.
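The two-stage idea can be sketched as follows: a detector proposes candidate fire boxes, and a classifier then scores each cropped region so that bright artificial lights can be rejected. This is a minimal sketch, not the report's implementation; `run_detector` and `classify_crop` are hypothetical stand-ins for YOLOv11 inference and EfficientNetV2-B0 classification, and the threshold value is an assumption.

```python
def two_stage_filter(image, run_detector, classify_crop, fire_threshold=0.5):
    """Keep only detections whose cropped region the classifier accepts as fire.

    `image` is a 2-D list of pixel rows; `run_detector` returns
    (x1, y1, x2, y2, score) tuples; `classify_crop` returns the
    probability that a crop contains fire. All names are illustrative.
    """
    kept = []
    for (x1, y1, x2, y2, det_score) in run_detector(image):
        # Stage 2 input: the candidate region proposed by the detector.
        crop = [row[x1:x2] for row in image[y1:y2]]
        p_fire = classify_crop(crop)
        if p_fire >= fire_threshold:
            # Combine detector and classifier confidence for the final score.
            kept.append((x1, y1, x2, y2, det_score * p_fire))
    return kept


if __name__ == "__main__":
    # Toy 10x10 "image" with a bright pixel marking the true fire region.
    image = [[0] * 10 for _ in range(10)]
    image[0][0] = 1

    # Stub detector: one true fire box and one artificial-light false positive.
    def run_detector(img):
        return [(0, 0, 4, 4, 0.9), (5, 5, 9, 9, 0.8)]

    # Stub classifier: calls a crop "fire" only if it contains the marker pixel.
    def classify_crop(crop):
        return 0.9 if crop[0][0] == 1 else 0.1

    print(two_stage_filter(image, run_detector, classify_crop))
```

The classifier acts purely as a verification stage here: it never adds boxes, it only suppresses detector proposals, which is why this arrangement targets false positives from artificial lights rather than recall.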