Encoder-decoder models have been widely used in RGBD semantic segmentation, and most of them are designed as two-stream networks. In general, jointly reasoning over the color and geometric information from RGBD data is beneficial for semantic segmentation. However, most existing approaches fail to comprehensively utilize multimodal information in both the encoder and the decoder. In this paper, we propose a novel attention-based dual supervised decoder for RGBD semantic segmentation. In the encoder, we design a simple yet effective attention-based multimodal fusion module to extract and fuse multi-level paired complementary information. To learn more robust deep representations and rich multimodal information, we introduce a dual-branch decoder that effectively leverages the correlations and complementary cues of different tasks. Extensive experiments on the NYUDv2 and SUN-RGBD datasets demonstrate that our method achieves superior performance compared with state-of-the-art methods.
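The abstract does not specify how the attention-based fusion module is built. As a rough illustration only, the sketch below shows one plausible form such a module could take: squeeze-and-excitation-style channel attention computed jointly over the RGB and depth streams, followed by element-wise fusion. The module name, reduction ratio, and sum-based fusion are assumptions for illustration, not the authors' exact design.

```python
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    """Hypothetical attention-based fusion of paired RGB and depth features.

    A minimal sketch: channel attention weights are derived from the
    concatenated modalities, then each stream is re-weighted and summed.
    All design choices here are illustrative assumptions.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * channels, (2 * channels) // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d((2 * channels) // reduction, 2 * channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # Global context from both modalities drives the attention weights.
        joint = torch.cat([rgb, depth], dim=1)        # (B, 2C, H, W)
        weights = self.mlp(self.pool(joint))          # (B, 2C, 1, 1)
        w_rgb, w_depth = weights.chunk(2, dim=1)      # (B, C, 1, 1) each
        # Re-weight each stream, then fuse by element-wise sum.
        return rgb * w_rgb + depth * w_depth


if __name__ == "__main__":
    fuse = AttentionFusion(channels=64)
    rgb_feat = torch.randn(2, 64, 30, 40)
    depth_feat = torch.randn(2, 64, 30, 40)
    print(fuse(rgb_feat, depth_feat).shape)  # torch.Size([2, 64, 30, 40])
```

Deriving the weights from the concatenated features (rather than per stream independently) lets each modality's attention depend on the other, which matches the abstract's emphasis on paired complementary information; a fusion module operating at each encoder level would yield the multi-level fusion the abstract describes.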