Voxel-MAE: Masked Autoencoders for Pre-training Large-scale Point Clouds

Mask-based pre-training has achieved great success in self-supervised learning for images and language without manually annotated supervision. However, it has not yet been studied for large-scale point clouds, which carry highly redundant spatial information. In this work, we propose a masked voxel autoencoder network for pre-training on large-scale point clouds, dubbed Voxel-MAE. Our key idea is to transform the point clouds into voxel representations and classify whether each voxel contains points. This simple but effective strategy makes the network aware of object shape at the voxel level, thus improving the performance of downstream tasks such as 3D object detection. Because large-scale point clouds are highly spatially redundant, Voxel-MAE can still learn representative features even with a 90% masking ratio. We also validate Voxel-MAE on unsupervised domain adaptation tasks, which demonstrates its generalization ability. Voxel-MAE shows that it is feasible to pre-train on large-scale point clouds without data annotations to enhance the perception ability of autonomous vehicles. Extensive experiments with 3D object detectors (SECOND, CenterPoint, and PV-RCNN) on three popular datasets (KITTI, Waymo, and nuScenes) show the effectiveness of our pre-training method.
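The pretext task described in the abstract (voxelize, mask most voxels, classify occupancy) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not specify the voxel size, the masking procedure, or the loss. The function names, the uniform random masking, and the binary cross-entropy loss below are all assumptions made for illustration.

```python
import math
import random


def voxelize(points, voxel_size):
    """Return the set of integer voxel indices that contain at least one point."""
    occupied = set()
    for x, y, z in points:
        occupied.add((
            int(x // voxel_size),
            int(y // voxel_size),
            int(z // voxel_size),
        ))
    return occupied


def mask_voxels(occupied, mask_ratio, rng):
    """Randomly split occupied voxels into (visible, masked) sets.

    Uniform random masking is an assumption; the abstract only states
    that a masking ratio as high as 90% still works.
    """
    voxels = sorted(occupied)  # sort first so results depend only on rng
    rng.shuffle(voxels)
    n_masked = int(round(mask_ratio * len(voxels)))
    return set(voxels[n_masked:]), set(voxels[:n_masked])


def occupancy_labels(query_voxels, occupied):
    """Binary target per query voxel: 1 if it contains points, else 0."""
    return [1 if v in occupied else 0 for v in query_voxels]


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy between predicted occupancy and targets.

    The choice of loss is an assumption, not taken from the abstract.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)
```

In a real pipeline, an encoder would see only the visible voxels and a decoder would predict occupancy probabilities for the query voxels; here the probabilities passed to `binary_cross_entropy` stand in for that network output.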



Point cloud


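Point clouds obtained by laser measurement contain 3D coordinates (XYZ) and laser reflection intensity (Intensity). Point clouds obtained by photogrammetry contain 3D coordinates (XYZ) and color information (RGB). Point clouds obtained by combining laser measurement and photogrammetry contain 3D coordinates (XYZ), laser reflection intensity (Intensity), and color information (RGB). Once the spatial coordinates of each sample point on an object's surface have been acquired, the result is a set of points, called a "point cloud".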
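As a rough illustration of the three variants above, a point record might carry mandatory XYZ coordinates plus optional intensity and color fields. The `Point` class and its field names are hypothetical, chosen only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Point:
    """One sample of a point cloud (hypothetical layout for illustration)."""
    x: float
    y: float
    z: float
    # Present for laser-measured clouds; None otherwise.
    intensity: Optional[float] = None
    # Present for photogrammetric clouds; None otherwise.
    rgb: Optional[Tuple[int, int, int]] = None


# A point cloud is simply a collection of such points.
PointCloud = List[Point]
```

A laser-only point would be built as `Point(1.0, 2.0, 3.0, intensity=0.5)`, leaving `rgb` unset.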
