RGBD-based models for salient object detection (SOD) have recently attracted research attention. Depth cues such as boundaries, surface normals, and shape attributes help identify salient objects in complex scenes. However, most RGBD networks require both modalities at the input and process them separately through a two-stream design, which inevitably incurs extra cost in depth sensors and computation. To address these drawbacks, we present a novel fusion design named modality-guided subnetwork (MGSnet). It offers the following advantages: 1) Our model works for both RGB and RGBD data, and dynamically estimates depth when it is not available. Drawing on the inner workings of depth-prediction networks, we propose to estimate pseudo-geometry maps from the RGB input, essentially mimicking multi-modality input. 2) For RGB SOD, our MGSnet runs in real time while achieving state-of-the-art performance compared to other RGB models. 3) The flexible and lightweight design of MGS facilitates its integration into RGBD two-stream models. The introduced fusion design enables cross-modality interaction that brings further improvement at minimal cost.
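The depth fallback described in point 1 can be illustrated with a minimal sketch. It is not the MGSnet implementation: the names `resolve_depth` and `placeholder_depth_estimator` are hypothetical, and the luminance-based estimator is only a stand-in for a learned depth-prediction network, used so that the control flow can be run without any trained model.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[Tuple[float, float, float]]]  # H x W RGB pixels in [0, 1]
DepthMap = List[List[float]]                    # H x W depth values


def placeholder_depth_estimator(rgb: Image) -> DepthMap:
    """Hypothetical stand-in for a learned depth-prediction network.

    It simply returns per-pixel luminance. A real system would run a
    trained monocular depth model here instead.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]


def resolve_depth(
    rgb: Image,
    depth: Optional[DepthMap] = None,
    estimator: Callable[[Image], DepthMap] = placeholder_depth_estimator,
) -> Tuple[DepthMap, bool]:
    """Return (depth_map, is_pseudo).

    Sensor depth is used when provided (the RGBD case); otherwise a
    pseudo-depth map is estimated from the RGB input (the RGB-only case).
    """
    if depth is not None:
        same_shape = len(depth) == len(rgb) and all(
            len(d_row) == len(rgb_row) for d_row, rgb_row in zip(depth, rgb)
        )
        if not same_shape:
            raise ValueError("depth map must match the RGB resolution")
        return depth, False
    return estimator(rgb), True
```

With this shape, the downstream fusion stage receives a depth-like map in both cases, and the `is_pseudo` flag lets a caller tell sensor depth apart from estimated depth.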