Voxel-MAE: Masked Autoencoders for Pre-training Large-scale Point Clouds

Mask-based pre-training has achieved great success in self-supervised learning for images, video, and language, without manually annotated supervision. However, for point clouds, which are likewise information-redundant data, it has not yet been studied in the field of 3D object detection. Because the point clouds in 3D object detection are large-scale, it is impossible to reconstruct the input point clouds. In this paper, we propose a masked voxel classification network for pre-training on large-scale point clouds. Our key idea is to divide the point clouds into voxel representations and classify whether each voxel contains points. This simple strategy makes the network voxel-aware of object shape, thus improving the performance of 3D object detection. Extensive experiments show the effectiveness of our pre-trained model with 3D object detectors (SECOND, CenterPoint, and PV-RCNN) on three popular datasets (KITTI, Waymo, and nuScenes). Code is publicly available at https://github.com/chaytonmin/Voxel-MAE.
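The key idea in the abstract (voxelize the point cloud, then classify whether each voxel contains points) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the grid range, the voxel size, and the masking ratio of 0.7 are all hypothetical values chosen for illustration, and the network itself is omitted. The sketch only shows how an occupancy target could be built, how a random subset of occupied voxels could be hidden, and how a binary cross-entropy loss against the occupancy target could be computed.

```python
import numpy as np

def voxel_occupancy(points, voxel_size, pc_range):
    """Return a boolean grid marking which voxels contain at least one point."""
    pc_min = np.asarray(pc_range[:3], dtype=np.float64)
    pc_max = np.asarray(pc_range[3:], dtype=np.float64)
    voxel_size = np.asarray(voxel_size, dtype=np.float64)
    grid_shape = np.round((pc_max - pc_min) / voxel_size).astype(int)

    # Keep only points inside the range (upper bound exclusive).
    xyz = points[:, :3]
    inside = np.all((xyz >= pc_min) & (xyz < pc_max), axis=1)
    idx = np.floor((xyz[inside] - pc_min) / voxel_size).astype(int)
    idx = np.minimum(idx, grid_shape - 1)  # guard against floating-point edge cases

    occupancy = np.zeros(grid_shape, dtype=bool)
    occupancy[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return occupancy

def mask_occupied_voxels(occupancy, mask_ratio, rng):
    """Hide a random fraction of the occupied voxels; return (visible, masked)."""
    occupied = np.flatnonzero(occupancy)
    n_mask = int(round(mask_ratio * occupied.size))
    hidden = rng.choice(occupied, size=n_mask, replace=False)
    masked = np.zeros(occupancy.size, dtype=bool)
    masked[hidden] = True
    masked = masked.reshape(occupancy.shape)
    visible = occupancy & ~masked
    return visible, masked

def occupancy_bce(logits, occupancy):
    """Binary cross-entropy between per-voxel logits and the occupancy target."""
    target = occupancy.astype(np.float64)
    # Numerically stable form: max(x, 0) - x*t + log(1 + exp(-|x|))
    loss = np.maximum(logits, 0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    return loss.mean()

# Toy data: 500 random points in a hypothetical 8 m x 8 m x 2 m range.
rng = np.random.default_rng(0)
points = rng.uniform([0, -4, -1], [8, 4, 1], size=(500, 3))
occ = voxel_occupancy(points, voxel_size=(1.0, 1.0, 1.0), pc_range=(0, -4, -1, 8, 4, 1))
visible, masked = mask_occupied_voxels(occ, mask_ratio=0.7, rng=rng)

# A real model would predict logits from `visible`; zeros stand in for an untrained model.
loss = occupancy_bce(np.zeros(occ.shape), occ)
```

In this sketch the encoder would see only the `visible` voxels, while the loss is computed against the full occupancy grid `occ`, so the model has to infer which hidden voxels contain points. Predicting one bit per voxel is far cheaper than regressing every point coordinate, which is the motivation the abstract gives for classification over reconstruction.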


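Point cloud

A point cloud obtained by laser measurement contains three-dimensional coordinates (XYZ) and laser reflection intensity (Intensity). A point cloud obtained by photogrammetry contains three-dimensional coordinates (XYZ) and color information (RGB). A point cloud obtained by combining laser measurement and photogrammetry contains three-dimensional coordinates (XYZ), laser reflection intensity (Intensity), and color information (RGB). Once the spatial coordinates of each sampled point on an object's surface have been acquired, the result is a set of points, called a "point cloud".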
