We describe a simple pre-training approach for point clouds. It works in three steps: 1. Mask all points occluded from a camera view; 2. Learn an encoder-decoder model to reconstruct the occluded points; 3. Use the encoder weights as initialisation for downstream point cloud tasks. We find that even with a single pre-training dataset (constructed from ModelNet40), this pre-training method improves accuracy across different datasets and encoders, on a wide range of downstream tasks. Specifically, we show that our method outperforms previous pre-training methods on object classification and on both part-based and semantic segmentation tasks. We study the pre-trained features and find that they lead to wide downstream minima, have high transformation invariance, and have activations that are highly correlated with part labels. Code and data are available at: https://github.com/hansen7/OcCo
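The first step above, masking points occluded from a camera view, can be sketched with a simple z-buffer heuristic. The following is a minimal illustration, not the paper's exact pipeline: it assumes the cloud is viewed along the +z axis, buckets points into an x-y grid of "pixels", and keeps only the nearest point per cell; all names here are hypothetical.

```python
import numpy as np

def occlusion_mask(points, grid=32):
    """Hypothetical z-buffer sketch of view-based occlusion masking:
    points viewed along +z; only the nearest point per x-y grid cell
    is marked visible, the rest are treated as occluded."""
    xy = points[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    # Normalise x, y into [0, 1) and bucket into grid cells.
    cells = np.floor((xy - lo) / (hi - lo + 1e-9) * grid).astype(int)
    keys = cells[:, 0] * grid + cells[:, 1]

    visible = np.zeros(len(points), dtype=bool)
    for k in np.unique(keys):
        idx = np.where(keys == k)[0]
        visible[idx[np.argmin(points[idx, 2])]] = True  # nearest point wins
    return visible

# Usage: split a cloud into a visible partial input and occluded targets;
# an encoder-decoder would then be trained to reconstruct the targets.
pts = np.random.rand(2048, 3)
vis = occlusion_mask(pts)
partial, target = pts[vis], pts[~vis]
```

At most one point per grid cell is visible, so a dense cloud yields a heavily occluded partial input, which is what makes the completion task non-trivial.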