Mask-based pre-training has achieved great success in self-supervised learning for images, video, and language, without manually annotated supervision. However, it has not yet been studied for large-scale point clouds in autonomous driving, which contain highly redundant spatial information. Because the number of points in large-scale point clouds is huge, it is infeasible to reconstruct the input point clouds. In this paper, we propose a masked voxel classification network for pre-training on large-scale point clouds. Our key idea is to divide the point clouds into voxel representations and classify whether each voxel contains points. This simple strategy makes the network voxel-aware of object shape, thus improving performance on downstream tasks such as 3D object detection. Owing to the high spatial redundancy of large-scale point clouds, our Voxel-MAE can still learn representative features even with a 90% masking ratio. We also validate the effectiveness of Voxel-MAE on unsupervised domain adaptation tasks, which demonstrates its generalization ability. Voxel-MAE shows that it is feasible to pre-train on large-scale point clouds without data annotations to enhance the perception ability of autonomous vehicles. Extensive experiments show the effectiveness of our pre-trained model with 3D object detectors (SECOND, CenterPoint, and PV-RCNN) on three popular datasets (KITTI, Waymo, and nuScenes). Code is publicly available at https://github.com/chaytonmin/Voxel-MAE.
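The masked voxel classification idea described above can be illustrated with a minimal NumPy sketch. It is not the released Voxel-MAE implementation: the function names (`voxel_occupancy`, `mask_occupied_voxels`, `occupancy_bce`), the point-cloud range, the voxel size, the uniform random masking, and the all-zero placeholder logits are assumptions chosen only for illustration. The sketch voxelizes a point cloud into a binary occupancy grid, hides 90% of the occupied voxels, and scores a prediction of the full occupancy grid with binary cross-entropy.

```python
import numpy as np


def voxel_occupancy(points, pc_range, voxel_size):
    """Return a boolean grid marking voxels that contain at least one point."""
    lo = np.asarray(pc_range[:3], dtype=np.float64)
    hi = np.asarray(pc_range[3:], dtype=np.float64)
    size = np.asarray(voxel_size, dtype=np.float64)
    grid_shape = np.round((hi - lo) / size).astype(int)
    # Keep only points that fall inside the chosen range.
    inside = np.all((points >= lo) & (points < hi), axis=1)
    idx = np.floor((points[inside] - lo) / size).astype(int)
    idx = np.minimum(idx, grid_shape - 1)  # guard against float rounding
    occ = np.zeros(grid_shape, dtype=bool)
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return occ


def mask_occupied_voxels(occ, mask_ratio, rng):
    """Randomly hide a fraction of the occupied voxels (uniform masking, assumed)."""
    occupied = np.argwhere(occ)
    n_mask = int(round(mask_ratio * len(occupied)))
    chosen = occupied[rng.permutation(len(occupied))[:n_mask]]
    masked = np.zeros_like(occ)
    masked[tuple(chosen.T)] = True
    visible = occ & ~masked  # what the encoder would be allowed to see
    return visible, masked


def occupancy_bce(logits, target):
    """Numerically stable binary cross-entropy between logits and 0/1 targets."""
    t = target.astype(np.float64)
    per_voxel = np.maximum(logits, 0) - logits * t + np.log1p(np.exp(-np.abs(logits)))
    return float(np.mean(per_voxel))


# Toy data: random points stand in for a LiDAR sweep (illustrative only).
rng = np.random.default_rng(0)
points = rng.uniform(-10.0, 10.0, size=(2000, 3))
occ = voxel_occupancy(points, pc_range=(-8, -8, -2, 8, 8, 2), voxel_size=(0.5, 0.5, 0.5))
visible, masked = mask_occupied_voxels(occ, mask_ratio=0.9, rng=rng)

# A real network would map `visible` to per-voxel logits; zeros are a placeholder.
logits = np.zeros(occ.shape)
loss = occupancy_bce(logits, occ)
```

In this sketch the classification target is the original occupancy grid `occ`, so the pretext task is to recover which voxels contained points from the small visible subset, rather than to regress the coordinates of every point.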