RGB-D saliency detection has attracted increasing attention, due to its effectiveness and the fact that depth cues can now be conveniently captured. Existing works often focus on learning a shared representation through various fusion strategies, and few methods explicitly consider how to preserve modality-specific characteristics. In this paper, taking a new perspective, we propose a specificity-preserving network (SP-Net) for RGB-D saliency detection, which improves saliency detection performance by exploring both the shared information and modality-specific properties (i.e., specificity). Specifically, two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps. A cross-enhanced integration module (CIM) is proposed to fuse cross-modal features in the shared learning network, which are then propagated to the next layer for integrating cross-level information. In addition, we propose a multi-modal feature aggregation (MFA) module to integrate the modality-specific features from each individual decoder into the shared decoder, which can provide rich complementary multi-modal information to boost the saliency detection performance. Further, skip connections are used to combine hierarchical features between the encoder and decoder layers. Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods. Code is available at: https://github.com/taozh2017/SPNet.
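To make the described dataflow concrete, the sketch below traces how modality-specific features could pass through a cross-modal fusion step and a multi-modal aggregation step. This is a minimal NumPy illustration of the two-stream-plus-shared-branch idea only; the function names `cim_fuse` and `mfa_aggregate` and the exact fusion arithmetic (cross-enhancement by element-wise multiplication, then summation) are illustrative assumptions, not the actual CIM/MFA formulations from the paper.

```python
import numpy as np

def cim_fuse(f_rgb, f_depth):
    """Cross-modal fusion sketch (NOT the paper's exact CIM):
    each modality's features are enhanced by the other modality
    before being merged into a shared representation."""
    rgb_enh = f_rgb * (1.0 + f_depth)    # depth-enhanced RGB features
    dep_enh = f_depth * (1.0 + f_rgb)    # RGB-enhanced depth features
    return rgb_enh + dep_enh             # shared fused features

def mfa_aggregate(shared, f_rgb_dec, f_depth_dec):
    """Multi-modal aggregation sketch (NOT the paper's exact MFA):
    modality-specific decoder features are folded back into the
    shared decoder to supply complementary information."""
    return shared + f_rgb_dec + f_depth_dec

# Toy features: batch of 1, 8 channels, 4x4 spatial map.
rgb = np.random.rand(1, 8, 4, 4)
depth = np.random.rand(1, 8, 4, 4)

shared = cim_fuse(rgb, depth)            # shared-branch features
out = mfa_aggregate(shared, rgb, depth)  # aggregated decoder features
print(out.shape)
```

The point of the sketch is structural: the two modality-specific streams stay intact (and can still produce their own saliency maps), while the shared branch receives both the cross-fused features and the per-modality decoder features.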