Voxel-MAE: Masked Autoencoders for Pre-training Large-scale Point Clouds

Mask-based pre-training has achieved great success in self-supervised learning for images, video, and language, without manually annotated supervision. However, it has not yet been studied for the large-scale point clouds used in autonomous driving, which carry redundant spatial information. Because large-scale point clouds contain a huge number of points, it is infeasible to reconstruct the input point clouds. In this paper, we propose a masked voxel classification network for pre-training on large-scale point clouds. Our key idea is to divide the point clouds into voxel representations and classify whether each voxel contains points. This simple strategy makes the network voxel-aware of object shape, thus improving the performance of downstream tasks such as 3D object detection. Even with a 90% masking ratio, our Voxel-MAE can still learn representative features, given the high spatial redundancy of large-scale point clouds. We also validate the effectiveness of Voxel-MAE on unsupervised domain adaptation tasks, which demonstrates its generalization ability. Voxel-MAE shows that it is feasible to pre-train on large-scale point clouds without data annotations to enhance the perception ability of autonomous vehicles. Extensive experiments show the effectiveness of our pre-trained model with 3D object detectors (SECOND, CenterPoint, and PV-RCNN) on three popular datasets (KITTI, Waymo, and nuScenes). Code is publicly available at https://github.com/chaytonmin/Voxel-MAE.
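The pretext task described in the abstract (voxelize, mask, then classify occupancy) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, grid parameters, and the choice to mask only occupied voxels are assumptions made for illustration, and the encoder/decoder network that would produce the logits is omitted entirely.

```python
import numpy as np


def voxel_occupancy(points, grid_min, voxel_size, grid_shape):
    """Return a boolean grid marking voxels that contain at least one point."""
    # Map each point's XYZ coordinates to integer voxel indices.
    idx = np.floor((points[:, :3] - grid_min) / voxel_size).astype(np.int64)
    # Drop points that fall outside the grid.
    in_range = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    idx = idx[in_range]
    occ = np.zeros(grid_shape, dtype=bool)
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return occ


def mask_occupied_voxels(occ, mask_ratio, rng):
    """Randomly hide a fraction of the occupied voxels (an assumed strategy).

    Returns (visible, masked): the occupancy grid a network would see,
    and the grid of voxels that were hidden from it.
    """
    occupied = np.flatnonzero(occ)
    n_mask = int(round(mask_ratio * occupied.size))
    hidden = rng.choice(occupied, size=n_mask, replace=False)
    masked = np.zeros(occ.size, dtype=bool)
    masked[hidden] = True
    masked = masked.reshape(occ.shape)
    visible = occ & ~masked
    return visible, masked


def occupancy_bce(logits, target):
    """Numerically stable binary cross-entropy on per-voxel occupancy logits."""
    t = target.astype(np.float64)
    loss = np.maximum(logits, 0) - logits * t + np.log1p(np.exp(-np.abs(logits)))
    return float(np.mean(loss))


# Toy data: 2000 random points in an 8 m cube, 0.5 m voxels, 90% masking.
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 8.0, size=(2000, 3))
occ = voxel_occupancy(points, grid_min=np.zeros(3), voxel_size=0.5,
                      grid_shape=(16, 16, 16))
visible, masked = mask_occupied_voxels(occ, mask_ratio=0.9, rng=rng)
```

In this sketch a network would receive `visible` as input and be trained with `occupancy_bce` against the full grid `occ`, so the target is a per-voxel yes/no label rather than the coordinates of every original point.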
