Robust 3D Scene Segmentation through Hierarchical and Learnable Part-Fusion

3D semantic segmentation is a fundamental building block for several scene understanding applications such as autonomous driving, robotics and AR/VR. Several state-of-the-art semantic segmentation models suffer from the part misclassification problem, wherein parts of the same object are labelled incorrectly. Previous methods have used hierarchical, iterative schemes to fuse semantic and instance information, but they lack learnability in context fusion and are computationally complex and heuristic-driven. This paper presents Segment-Fusion, a novel attention-based method for hierarchical fusion of semantic and instance information to address part misclassifications. The method comprises a graph segmentation algorithm that groups points into segments and pools point-wise features into segment-wise features, a learnable attention-based network that fuses these segments based on their semantic and instance features, and a simple yet effective connected component labelling algorithm that converts segment features to instance labels. Segment-Fusion can be flexibly employed with any network architecture for semantic/instance segmentation. It improves the qualitative and quantitative performance of several semantic segmentation backbones by up to 5% when evaluated on the ScanNet and S3DIS datasets.
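
The three stages named in the abstract (pooling point-wise features into segment-wise features, deciding which adjacent segments to fuse, and connected component labelling) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the mean-pooling choice, the segment adjacency list `edges`, and the `threshold` parameter are all assumptions made for illustration, and the learnable attention-based network is replaced here by a plain cosine-similarity threshold so that the example stays self-contained.

```python
import math


def pool_segment_features(point_feats, segment_ids, num_segments):
    """Average the point-wise features of all points in each segment.

    Mean pooling is an assumption; the abstract only says features are pooled.
    """
    dim = len(point_feats[0])
    sums = [[0.0] * dim for _ in range(num_segments)]
    counts = [0] * num_segments
    for feat, seg in zip(point_feats, segment_ids):
        counts[seg] += 1
        for d in range(dim):
            sums[seg][d] += feat[d]
    # max(c, 1) guards against segments that received no points.
    return [[v / max(c, 1) for v in row] for row, c in zip(sums, counts)]


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0


def label_instances(seg_feats, edges, threshold=0.9):
    """Connected component labelling over segments via union-find.

    Two adjacent segments are fused when their pooled features are similar.
    The similarity threshold is a stand-in for the learned fusion decision.
    """
    parent = list(range(len(seg_feats)))

    def find(x):
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        if cosine(seg_feats[a], seg_feats[b]) >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)

    # Relabel roots to consecutive instance ids in order of first appearance.
    remap = {}
    labels = []
    for i in range(len(parent)):
        root = find(i)
        labels.append(remap.setdefault(root, len(remap)))
    return labels


# Toy input: six points grouped into four segments, segments adjacent in a chain.
point_feats = [[1.0, 0.0], [1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.0, 1.0], [0.1, 0.9]]
segment_ids = [0, 0, 1, 2, 2, 3]
edges = [(0, 1), (1, 2), (2, 3)]

seg_feats = pool_segment_features(point_feats, segment_ids, num_segments=4)
seg_labels = label_instances(seg_feats, edges)
point_labels = [seg_labels[s] for s in segment_ids]
```

Operating on segments rather than on individual points keeps the fusion step small: the number of pairwise decisions scales with the number of adjacent segment pairs, not with the number of points, and every point in a segment inherits the same label.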
