This paper presents a unified neural network structure for joint 3D object detection and point cloud segmentation. We leverage rich supervision from both detection and segmentation labels rather than from only one of them. In addition, we propose an extension to single-stage object detectors based on the implicit function widely used in 3D scene and object understanding. The extension branch takes the final feature map of the object detection module as input and produces an implicit function that generates the semantic distribution of each point with respect to its corresponding voxel center. We demonstrate the performance of our structure on nuScenes-lidarseg, a large-scale outdoor dataset. Our solution achieves results competitive with state-of-the-art methods in both 3D object detection and point cloud segmentation, with little additional computational load compared with object detection solutions. Experiments also validate the capability of the proposed method for efficient weakly supervised semantic segmentation.
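The idea behind the extension branch can be illustrated with a minimal sketch. It is not the paper's implementation: the bird's-eye-view grid layout, the two-layer MLP, the random weights, and every name below (`point_semantics`, `make_layer`, `voxel_size`) are assumptions made for illustration only. The sketch assumes the implicit function receives the feature vector of the voxel containing a point, concatenated with the point's offset from that voxel's center, and returns a distribution over semantic classes.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def make_layer(rng, n_in, n_out):
    """Hypothetical randomly initialised layer (stands in for trained weights)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def point_semantics(point, feature_map, voxel_size, layers):
    """Evaluate the assumed implicit function for a single point.

    feature_map[i][j] is the detector's final feature vector for grid
    cell (i, j); the point is described relative to that cell's center.
    """
    x, y, z = point
    i, j = int(x // voxel_size), int(y // voxel_size)
    center = ((i + 0.5) * voxel_size, (j + 0.5) * voxel_size, 0.0)
    offset = [x - center[0], y - center[1], z - center[2]]
    hidden_in = feature_map[i][j] + offset  # concatenate feature and offset
    (w1, b1), (w2, b2) = layers
    hidden = [max(0.0, v) for v in linear(hidden_in, w1, b1)]  # ReLU
    return softmax(linear(hidden, w2, b2))


rng = random.Random(0)
C, K, H, W = 4, 3, 2, 2  # channels, classes, grid height, grid width
feature_map = [[[rng.uniform(-1, 1) for _ in range(C)] for _ in range(W)]
               for _ in range(H)]
layers = (make_layer(rng, C + 3, 8), make_layer(rng, 8, K))
probs = point_semantics((0.7, 1.2, 0.3), feature_map, 1.0, layers)
```

Because the per-point function only reads a feature map the detector has already computed, a branch of this kind adds a small per-point cost on top of detection, which is consistent with the low overhead claimed above.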