RGBT salient object detection (SOD) aims to segment the common prominent regions of visible and thermal infrared images. Existing RGBT SOD methods do not fully explore and exploit the complementarity of the different modalities or the global context of image content, both of which play a vital role in achieving accurate results. In this paper, we propose a multi-interactive Siamese decoder to mine and model the multi-type interactions for accurate RGBT SOD. Specifically, we first encode each RGB and thermal image pair into multi-level multi-modal representations. Then, we design a novel Siamese decoder to integrate the multi-level interactions of the dual modalities and global contexts. With these interactions, our method works well in diverse challenging scenarios, even in the presence of an invalid modality. Moreover, the Siamese decoder employs label supervision to drive feature learning in each modality, and the modality bias is thus suppressed. Finally, we carry out extensive experiments on several benchmark datasets, and the results show that the proposed method achieves outstanding performance against state-of-the-art algorithms. The source code has been released at: https://github.com/lz118/Multi-interactive-Siamese-Decoder.
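The two-stream idea summarized in the abstract can be illustrated with a toy sketch. This is not the authors' network: the names `SharedDecoder` and `siamese_forward`, the scalar weights, the 1-D "feature maps", and the use of a plain mean as the global context are all assumptions made for illustration. The sketch only shows three things the abstract states: one decoder whose weights are shared by both streams, each stream combining its own modality with the other modality and a global context, and the same label supervising both streams.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce(pred, label, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(pred, label):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(label)


class SharedDecoder:
    """Hypothetical toy decoder: one weight set reused by both streams."""

    def __init__(self, w_own, w_other, w_global, bias):
        self.w_own = w_own        # weight on the stream's own modality
        self.w_other = w_other    # weight on the other modality
        self.w_global = w_global  # weight on the global context
        self.bias = bias

    def decode(self, own, other, global_ctx):
        # Per-position saliency probability from the three interaction terms.
        return [
            sigmoid(self.w_own * a + self.w_other * b
                    + self.w_global * global_ctx + self.bias)
            for a, b in zip(own, other)
        ]


def siamese_forward(decoder, rgb_feat, thermal_feat, label):
    """Run both streams through the same decoder and sum their losses."""
    # Assumed stand-in for global context: mean over both modalities.
    n = len(rgb_feat) + len(thermal_feat)
    global_ctx = (sum(rgb_feat) + sum(thermal_feat)) / n
    pred_rgb = decoder.decode(rgb_feat, thermal_feat, global_ctx)
    pred_thermal = decoder.decode(thermal_feat, rgb_feat, global_ctx)
    # The same label supervises each stream.
    loss = bce(pred_rgb, label) + bce(pred_thermal, label)
    return pred_rgb, pred_thermal, loss


# Toy input where the thermal modality is invalid (all zeros).
decoder = SharedDecoder(w_own=2.0, w_other=2.0, w_global=0.5, bias=-1.0)
rgb = [1.0, 0.0, 1.0, 0.0]
thermal = [0.0, 0.0, 0.0, 0.0]
label = [1, 0, 1, 0]
pred_rgb, pred_thermal, loss = siamese_forward(decoder, rgb, thermal, label)
```

In this toy setting the thermal stream still marks the salient positions because it receives the RGB features through the cross-modal term, which is the intuition behind robustness to an invalid modality. The weights here are fixed by hand; in a real model they would be learned by minimizing the summed loss.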