Recent RGB-D semantic segmentation has attracted growing research interest thanks to the availability of complementary modalities at the input. Existing works often adopt a two-stream architecture that processes photometric and geometric information in parallel, but few methods explicitly leverage depth cues to adjust the sampling positions on RGB images. In this paper, we propose a novel framework to incorporate depth information into the RGB convolutional neural network (CNN), termed Z-ACN (Depth-Adapted CNN). Specifically, our Z-ACN generates a 2D depth-adapted offset, fully constrained by low-level features, to guide feature extraction on RGB images. With the generated offset, we introduce two intuitive and effective operations to replace basic CNN operators: depth-adapted convolution and depth-adapted average pooling. Extensive experiments on both indoor and outdoor semantic segmentation tasks demonstrate the effectiveness of our approach.
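To make the idea concrete, the sketch below shows a depth-adapted convolution built on torchvision's `deform_conv2d`, assuming the sampling offset is computed from the depth map alone (fixed, nothing learned) rather than predicted from RGB features as in standard deformable convolution. The names `depth_to_offset` and `DepthAdaptedConv2d` are illustrative, and the gradient-based offset rule is a hypothetical stand-in: it does not reproduce the paper's exact geometric construction.

```python
# A minimal sketch, assuming PyTorch/torchvision. depth_to_offset is a
# hypothetical placeholder for Z-ACN's geometric offset computation: it is
# fixed (no learnable parameters) and driven only by the depth map, but it
# does NOT reproduce the paper's exact construction.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import deform_conv2d


def depth_to_offset(depth: torch.Tensor, kernel_size: int = 3) -> torch.Tensor:
    """Turn a depth map (N, 1, H, W) into a sampling offset of shape
    (N, 2*k*k, H, W), the layout expected by deform_conv2d
    (interleaved (dy, dx) pairs, one per kernel tap)."""
    k2 = kernel_size * kernel_size
    # Fixed central-difference depth gradients; nothing here is learned.
    gx = F.conv2d(depth, depth.new_tensor([[[[-1.0, 0.0, 1.0]]]]), padding=(0, 1))
    gy = F.conv2d(depth, depth.new_tensor([[[[-1.0], [0.0], [1.0]]]]), padding=(1, 0))
    # Illustrative rule: shift every kernel tap along the depth gradient,
    # so the sampling grid deforms where the observed surface is slanted.
    offset = torch.cat([gy, gx], dim=1)   # (N, 2, H, W)
    return offset.repeat(1, k2, 1, 1)     # (N, 2*k*k, H, W)


class DepthAdaptedConv2d(nn.Module):
    """Convolution on RGB features whose sampling grid is deformed by depth."""

    def __init__(self, in_ch: int, out_ch: int,
                 kernel_size: int = 3, padding: int = 1):
        super().__init__()
        self.weight = nn.Parameter(
            torch.empty(out_ch, in_ch, kernel_size, kernel_size))
        self.bias = nn.Parameter(torch.zeros(out_ch))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)
        self.kernel_size = kernel_size
        self.padding = padding

    def forward(self, rgb_feat: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # The offset comes from depth only; the convolution weights act on RGB.
        offset = depth_to_offset(depth, self.kernel_size)
        return deform_conv2d(rgb_feat, offset, self.weight, self.bias,
                             padding=self.padding)


# Usage: (2, 16, 64, 64) RGB features with an aligned (2, 1, 64, 64) depth map.
rgb_feat = torch.randn(2, 16, 64, 64)
depth = torch.rand(2, 1, 64, 64)
out = DepthAdaptedConv2d(16, 32)(rgb_feat, depth)  # -> (2, 32, 64, 64)
```

Depth-adapted average pooling follows the same pattern, with a fixed uniform kernel in place of the learned weights so that the depth-deformed neighborhood is averaged rather than filtered.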