Learning dense point-wise semantics from unstructured 3D point clouds with fewer labels, although a realistic problem, has been under-explored in the literature. While existing weakly supervised methods can effectively learn semantics from only a small fraction of point-level annotations, we find that plain bounding-box-level annotations are also informative for semantic segmentation of large-scale 3D point clouds. In this paper, we introduce a neural architecture, termed Box2Seg, that learns point-level semantics of 3D point clouds under bounding-box-level supervision. The key to our approach is to generate accurate pseudo-labels by exploring the geometric and topological structure inside and outside each bounding box. Specifically, an attention-based self-training (AST) technique and Point Class Activation Mapping (PCAM) are used to estimate the pseudo-labels. The network is then further trained and refined with these pseudo-labels. Experiments on two large-scale benchmarks, S3DIS and ScanNet, demonstrate the competitive performance of the proposed method. In particular, the proposed network can be trained with cheap, or even off-the-shelf, bounding-box-level annotations and subcloud-level tags.
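To make the idea of box-level supervision concrete, the following is a minimal sketch of how coarse initial pseudo-labels could be derived from box membership alone. It is not the AST or PCAM procedure of Box2Seg. The axis-aligned box format, the `BACKGROUND` and `UNLABELED` ids, and the function names are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
# Assumed box format: (min_corner, max_corner, class_id), axis-aligned.
Box = Tuple[Point, Point, int]

UNLABELED = -1   # hypothetical marker: ambiguous point, left for later refinement
BACKGROUND = 0   # hypothetical class id for points outside every box


def inside(point: Point, box: Box) -> bool:
    """Return True if the point lies within the axis-aligned box (inclusive)."""
    lo, hi, _ = box
    return all(l <= p <= h for p, l, h in zip(point, lo, hi))


def initial_pseudo_labels(points: Sequence[Point], boxes: Sequence[Box]) -> List[int]:
    """Assign a coarse label to each point from box membership alone."""
    labels = []
    for point in points:
        # Collect the distinct classes of all boxes containing this point.
        classes = {box[2] for box in boxes if inside(point, box)}
        if not classes:
            labels.append(BACKGROUND)           # outside all boxes
        elif len(classes) == 1:
            labels.append(next(iter(classes)))  # unambiguous: one class only
        else:
            labels.append(UNLABELED)            # overlapping boxes of different classes
    return labels


if __name__ == "__main__":
    boxes = [((0, 0, 0), (2, 1, 1), 1), ((1, 0, 0), (3, 1, 1), 2)]
    points = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5), (5, 5, 5), (2.5, 0.5, 0.5)]
    print(initial_pseudo_labels(points, boxes))
```

Labels obtained this way are noisy, since a box usually also encloses points that belong to other objects or to the background. That noise is what makes a subsequent refinement stage necessary.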