Semantic scene understanding from point clouds is particularly challenging, as the points reflect only a sparse set of the underlying 3D geometry. Previous works often convert point clouds into regular grids (e.g., voxels or bird's-eye view images) and resort to grid-based convolutions for scene understanding. In this work, we introduce RfD-Net, which jointly detects and reconstructs dense object surfaces directly from raw point clouds. Instead of representing scenes with regular grids, our method leverages the sparsity of point cloud data and focuses on predicting shapes that are recognized with high objectness. With this design, we decouple instance reconstruction into global object localization and local shape prediction. This not only eases the difficulty of learning 2-D manifold surfaces from sparse 3D space; the point clouds in each object proposal also convey shape details that support implicit function learning to reconstruct surfaces at any resolution. Our experiments indicate that instance detection and reconstruction have complementary effects: the shape prediction head consistently improves object detection across modern 3D proposal network backbones. Qualitative and quantitative evaluations further demonstrate that our approach consistently outperforms the state of the art, improving mesh IoU by over 11 points in object reconstruction.
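The decoupling described above, global object localization followed by local implicit shape prediction in a canonical frame, can be illustrated with a minimal sketch. This is not RfD-Net's actual architecture: `localize` and `implicit_occupancy` are hypothetical stand-ins (an axis-aligned bounding box and a toy analytic occupancy function) for the learned proposal network and implicit decoder; the point is only that once points are normalized into a proposal's frame, an implicit function can be queried at arbitrary resolution.

```python
import numpy as np

def localize(points):
    """Global localization (stand-in): axis-aligned box (center, size)
    enclosing an object's points. RfD-Net learns proposals instead."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    return (lo + hi) / 2.0, (hi - lo)

def implicit_occupancy(query, center, size):
    """Toy implicit function (stand-in): occupancy = 1 inside the unit
    sphere of the box-normalized frame. A learned implicit decoder
    conditioned on proposal features would replace this."""
    local = (query - center) / (size / 2.0 + 1e-8)  # canonical coordinates
    return (np.linalg.norm(local, axis=-1) <= 1.0).astype(np.float32)

# Sparse "scan" of a sphere-like object offset from the scene origin.
rng = np.random.default_rng(0)
dirs = rng.normal(size=(256, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
points = dirs + np.array([2.0, 0.0, 0.0])

# Step 1: global localization of the proposal.
center, size = localize(points)

# Step 2: query the implicit function on an arbitrarily dense grid;
# resolution is a free choice at inference time, not fixed by a voxel grid.
grid = np.stack(np.meshgrid(*[np.linspace(-2.0, 4.0, 16)] * 3,
                            indexing="ij"), axis=-1)
occ = implicit_occupancy(grid.reshape(-1, 3), center, size)
```

A surface mesh would then be extracted from `occ` (e.g., via marching cubes); raising the grid resolution refines the surface without retraining, which is the practical benefit of the implicit representation over fixed voxel grids.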