For semantic segmentation in urban scene understanding, RGB cameras alone often fail to capture a clear holistic topology, especially under challenging lighting conditions. The thermal signal is an informative additional channel that can reveal the contours and fine-grained texture of blurred regions in low-quality RGB images. For RGB-T (thermal) segmentation, existing methods either use simple, passive channel-wise or spatial-wise fusion for cross-modal interaction, or rely on heavy labeling of ambiguous boundaries for fine-grained supervision. We propose a Spatial-aware Demand-guided Recursive Meshing (SpiderMesh) framework that: 1) proactively compensates for inadequate contextual semantics in optically impaired regions via a demand-guided target masking algorithm; and 2) refines multimodal semantic features with recursive meshing to improve pixel-level semantic analysis. We further introduce an asymmetric data augmentation technique, M-CutOut, and enable semi-supervised learning to make full use of RGB-T labels, which are only sparsely available in practice. Extensive experiments on the MFNet and PST900 datasets demonstrate that SpiderMesh achieves new state-of-the-art performance on standard RGB-T segmentation benchmarks.
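The abstract names M-CutOut only as an asymmetric augmentation and does not specify its details. As a rough illustration of what "asymmetric" could mean for paired RGB-T inputs, the sketch below zeroes a random square patch in exactly one of the two modalities and leaves the other untouched. This is an assumption made for illustration, not the authors' implementation: the function name `asymmetric_cutout`, the `size` parameter, the uniform choice of modality, and the use of single-channel nested lists in place of image tensors are all hypothetical.

```python
import random


def asymmetric_cutout(rgb, thermal, size, rng=None):
    """Zero a square patch in exactly one of the two modalities.

    Hypothetical sketch, not the published M-CutOut procedure.
    rgb, thermal: H x W images as nested lists of scalar pixel values
                  (single-channel here for simplicity).
    size: side length of the square patch; it is clipped at the borders.
    Returns copies (rgb_out, thermal_out) and the name of the masked modality.
    """
    rng = rng or random.Random()
    h, w = len(rgb), len(rgb[0])

    # Sample the top-left corner of the patch, then clip it to the image.
    top, left = rng.randrange(h), rng.randrange(w)
    bottom, right = min(h, top + size), min(w, left + size)

    # Copy the inputs so the originals are left unmodified.
    rgb_out = [row[:] for row in rgb]
    thermal_out = [row[:] for row in thermal]

    # Assumption: the modality to mask is chosen uniformly at random.
    masked = rng.choice(["rgb", "thermal"])
    target = rgb_out if masked == "rgb" else thermal_out
    for y in range(top, bottom):
        for x in range(left, right):
            target[y][x] = 0

    return rgb_out, thermal_out, masked
```

Masking only one modality forces a fusion model to recover the hidden region from the other modality, which is one plausible motivation for an asymmetric scheme; the segmentation labels would be left unchanged under this assumption.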