Fully supervised semantic segmentation requires clean pixel-wise annotations, which are laborious and expensive to obtain. In this paper, we propose a weakly supervised 2D semantic segmentation model that combines sparse bounding box labels with available 3D information, which is much easier to obtain with advanced sensors. We manually label a subset of the 2D-3D Semantics (2D-3D-S) dataset with bounding boxes and introduce a 2D-3D inference module to generate accurate pixel-wise segment proposal masks. Guided by 3D information, we first generate a point cloud of objects and calculate an objectness probability score for each point. We then project the point cloud, together with its objectness probabilities, back onto the 2D images and apply a refinement step to obtain segment proposals, which are treated as pseudo labels for training a semantic segmentation network. Our method works recursively, gradually refining these segment proposals. Extensive experiments on the 2D-3D-S dataset show that the proposed method can generate accurate segment proposals even when bounding box labels are available on only a small subset of training images. A performance comparison with recent state-of-the-art methods further illustrates the effectiveness of our method.
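The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera with a known intrinsic matrix `K`, points already expressed in the camera frame, and a simple per-pixel maximum in place of the paper's refinement step. The function names `project_objectness` and `proposal_mask`, and the threshold of 0.5, are hypothetical choices made for illustration.

```python
import numpy as np


def project_objectness(points_cam, scores, K, height, width):
    """Project 3D points (camera frame) with per-point scores onto an image.

    Returns a (height, width) array holding, for each pixel, the maximum
    score among the points that land on it (0 where no point projects).
    Assumption: occlusion is ignored; no depth buffer is used.
    """
    points_cam = np.asarray(points_cam, dtype=float)
    scores = np.asarray(scores, dtype=float)
    K = np.asarray(K, dtype=float)
    score_map = np.zeros((height, width), dtype=float)

    # Keep only points in front of the camera (positive depth).
    in_front = points_cam[:, 2] > 0
    pts = points_cam[in_front]
    s = scores[in_front]

    # Pinhole projection: [u*z, v*z, z]^T = K @ [x, y, z]^T
    uvw = pts @ K.T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)

    # Discard projections that fall outside the image bounds.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)

    # Several points may hit the same pixel; keep the highest score.
    np.maximum.at(score_map, (v[inside], u[inside]), s[inside])
    return score_map


def proposal_mask(score_map, threshold=0.5):
    """Binarize the projected scores into a rough segment proposal."""
    return score_map >= threshold
```

In this sketch, the resulting mask would play the role of a coarse pseudo label; the refinement and recursive retraining described in the abstract are not modeled here.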