Multi-view projection methods have demonstrated promising performance on 3D understanding tasks like 3D classification and segmentation. However, it remains unclear how to combine such multi-view methods with the widely available 3D point clouds; previous methods rely on unlearned heuristics to combine features at the point level. To address this, we introduce the concept of the multi-view point cloud (Voint cloud), which represents each 3D point as a set of features extracted from several viewpoints. This novel 3D Voint cloud representation combines the compactness of 3D point cloud representation with the natural view-awareness of multi-view representation. Naturally, we can equip this new representation with convolutional and pooling operations. We deploy a Voint neural network (VointNet) with a theoretically established functional form to learn representations in the Voint space. Our novel representation achieves state-of-the-art performance on 3D classification and retrieval on ScanObjectNN, ModelNet40, and ShapeNet Core55. Additionally, we achieve competitive performance for 3D semantic segmentation on ShapeNet Parts. Further analysis shows that VointNet improves robustness to rotation and occlusion compared to other methods.
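The core idea above can be sketched in a few lines of numpy. This is a hypothetical illustration, not the authors' implementation: the shapes, the shared linear transform standing in for a Voint convolution, and the choice of max pooling over views are all assumptions made for clarity.

```python
import numpy as np

# Illustrative sketch of the Voint cloud idea (shapes and names are assumptions):
# each 3D point carries a set of per-view feature vectors, and a symmetric
# pooling over the view axis produces a view-aware per-point feature.

num_points, num_views, feat_dim = 1024, 12, 64

# Voint cloud tensor: (points, views, channels) -- features for each point as
# observed from each viewpoint (e.g. back-projected from multi-view CNN maps).
rng = np.random.default_rng(0)
voint_cloud = rng.random((num_points, num_views, feat_dim))

# A shared per-voint linear transform, applied independently to every
# (point, view) feature -- a stand-in for a learned Voint convolution.
W = rng.random((feat_dim, feat_dim))
transformed = voint_cloud @ W

# View pooling: collapse the view dimension with a symmetric function so the
# per-point output is invariant to the ordering of viewpoints.
point_features = transformed.max(axis=1)  # shape: (num_points, feat_dim)
```

Because the pooling is symmetric, permuting the views leaves `point_features` unchanged, which is what makes the representation view-order invariant.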