Point cloud semantic segmentation often requires large-scale annotated training data, but point-wise labels are tedious to prepare. While some recent methods propose to train a 3D network with small percentages of point labels, we take the approach to an extreme and propose "One Thing One Click," meaning that the annotator only needs to label one point per object. To leverage these extremely sparse labels in network training, we design a novel self-training approach in which we iteratively conduct training and label propagation, facilitated by a graph propagation module. We also adopt a relation network to generate per-category prototypes and explicitly model the similarity among graph nodes, producing pseudo labels that guide the iterative training. Experimental results on both ScanNet-v2 and S3DIS show that our self-training approach, with extremely sparse annotations, outperforms all existing weakly supervised methods for 3D semantic segmentation by a large margin, and our results are comparable to those of fully supervised counterparts.
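To make the idea of propagating one-click labels over a graph more concrete, the following is a minimal sketch and not the authors' implementation. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function name `propagate_labels`, the fixed node features (standing in for learned network features), cosine similarity to a mean-feature prototype (standing in for the learned relation network), and the `threshold` parameter. Each round recomputes per-category prototypes from the currently labeled nodes and then assigns a pseudo label to an unlabeled node only if a labeled neighbor's category prototype is similar enough.

```python
import numpy as np


def propagate_labels(features, edges, seeds, num_classes,
                     threshold=0.8, max_rounds=10):
    """Illustrative pseudo-label propagation (hypothetical helper).

    features:    (N, D) array, one feature vector per graph node.
    edges:       iterable of (i, j) undirected node pairs.
    seeds:       dict mapping node index -> class id (the "one click" labels).
    num_classes: number of categories.
    Returns an (N,) int array; -1 marks nodes left unlabeled.
    """
    features = np.asarray(features, dtype=float)
    n, dim = features.shape

    # L2-normalize so that dot products are cosine similarities.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)

    labels = np.full(n, -1, dtype=int)
    for node, cls in seeds.items():
        labels[node] = cls

    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    for _ in range(max_rounds):
        # Per-category prototype: normalized mean of labeled node features.
        # Assumption: a plain mean replaces the learned relation network.
        prototypes = np.zeros((num_classes, dim))
        for c in range(num_classes):
            members = unit[labels == c]
            if len(members) > 0:
                mean = members.mean(axis=0)
                prototypes[c] = mean / max(np.linalg.norm(mean), 1e-12)

        new_labels = labels.copy()
        for node in range(n):
            if labels[node] != -1:
                continue
            # Only categories already present among graph neighbors compete.
            candidates = {int(labels[m]) for m in neighbors[node]
                          if labels[m] != -1}
            if not candidates:
                continue
            best = max(candidates,
                       key=lambda c: float(unit[node] @ prototypes[c]))
            if float(unit[node] @ prototypes[best]) >= threshold:
                new_labels[node] = best

        if np.array_equal(new_labels, labels):
            break  # no new pseudo labels were produced this round
        labels = new_labels

    return labels


# Toy chain of six nodes forming two "objects", one clicked node each.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05],
                [0.05, 0.95], [0.1, 0.9], [0.0, 1.0]]
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
toy_labels = propagate_labels(toy_features, toy_edges,
                              seeds={0: 0, 5: 1}, num_classes=2)
print(toy_labels.tolist())
```

In a self-training loop of the kind the abstract describes, pseudo labels like these would be fed back as supervision for the next round of network training, and the refreshed features would be used for the next round of propagation. The sketch omits the network entirely and only shows the propagation step.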