Densely annotating LiDAR point clouds is costly, which limits the scalability of fully-supervised learning methods. In this work, we study semi-supervised learning (SSL) for LiDAR segmentation, a setting that remains underexplored. Our core idea is to leverage the strong spatial cues of LiDAR point clouds to better exploit unlabeled data. We propose LaserMix, which mixes laser beams from different LiDAR scans and then encourages the model to make consistent and confident predictions before and after mixing. Our framework has three appealing properties: 1) Generic: LaserMix is agnostic to the LiDAR representation (e.g., range view and voxel), so our SSL framework can be applied universally. 2) Statistically grounded: We provide a detailed analysis that theoretically explains the applicability of the proposed framework. 3) Effective: Comprehensive experiments on popular LiDAR segmentation datasets (nuScenes, SemanticKITTI, and ScribbleKITTI) demonstrate the effectiveness and superiority of our approach. Notably, we achieve results competitive with fully-supervised counterparts using 2x to 5x fewer labels, and we improve the supervised-only baseline by 10.8% on average. We hope this concise yet high-performing framework will facilitate future research in semi-supervised LiDAR segmentation. Code is publicly available.
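To make the beam-mixing idea concrete, the following is a minimal sketch, not the released implementation. It assumes that "laser beams" are grouped by inclination angle into equal-width bands and that alternating bands are taken from two scans; the abstract does not spell out the partition, so this choice, the function names (`inclination`, `band_index`, `mix_scans`), and the default angle range are all illustrative assumptions.

```python
import math


def inclination(point):
    """Elevation angle (radians) of an (x, y, z) point seen from the sensor origin."""
    x, y, z = point
    return math.atan2(z, math.hypot(x, y))


def band_index(point, num_bands, phi_min, phi_max):
    """Map a point to one of num_bands equal-width inclination bands (clamped)."""
    frac = (inclination(point) - phi_min) / (phi_max - phi_min)
    return min(num_bands - 1, max(0, int(frac * num_bands)))


def mix_scans(scan_a, scan_b, num_bands=4,
              phi_min=math.radians(-25.0), phi_max=math.radians(3.0)):
    """Interleave bands: even bands come from scan_a, odd bands from scan_b.

    Each scan is a list of (point, label) pairs, where label is either a
    ground-truth class or a pseudo-label. The default angle range is an
    arbitrary placeholder, not a value taken from the paper.
    """
    mixed = []
    for point, label in scan_a:
        if band_index(point, num_bands, phi_min, phi_max) % 2 == 0:
            mixed.append((point, label))
    for point, label in scan_b:
        if band_index(point, num_bands, phi_min, phi_max) % 2 == 1:
            mixed.append((point, label))
    return mixed
```

In a training loop built on this sketch, the labels carried into the mixed scan would serve as the target for the model's prediction on that scan, which is one way to read "consistent predictions before and after mixing".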