We introduce a novel deep learning-based framework to interpret 3D urban scenes represented as textured meshes. Based on the observation that object boundaries typically align with the boundaries of planar regions, our framework achieves semantic segmentation in two steps: planarity-sensible over-segmentation followed by semantic classification. The over-segmentation step generates an initial set of mesh segments that capture the planar and non-planar regions of urban scenes. In the subsequent classification step, we construct a graph that encodes the geometric and photometric features of the segments in its nodes and the multi-scale contextual features in its edges. The final semantic segmentation is obtained by classifying the segments using a graph convolutional network. Experiments and comparisons on two semantic urban mesh benchmarks demonstrate that our approach outperforms the state-of-the-art methods in terms of boundary quality, mean IoU (intersection over union), and generalization ability. We also introduce several new metrics for evaluating mesh over-segmentation methods dedicated to semantic segmentation, and our proposed over-segmentation approach outperforms state-of-the-art methods on all metrics. Our source code is available at \url{https://github.com/WeixiaoGao/PSSNet}.
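To make the classification step concrete, the following is a minimal sketch (not the authors' PSSNet implementation) of how per-segment classification over such a graph could look: segments become graph nodes carrying geometric and photometric features, edges carry contextual features, and an edge-conditioned graph convolution predicts a semantic label per segment. All names (`SegmentGraphConv`, `SegmentClassifier`), feature dimensions, layer sizes, and the class count are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of segment classification on a segment-adjacency graph.
# Dimensions, layer names, and aggregation scheme are assumptions for illustration.
import torch
import torch.nn as nn

class SegmentGraphConv(nn.Module):
    """One message-passing layer: neighbor messages are conditioned on the
    edge (contextual) features before being aggregated by mean."""
    def __init__(self, node_dim, edge_dim, out_dim):
        super().__init__()
        self.msg = nn.Linear(node_dim + edge_dim, out_dim)
        self.upd = nn.Linear(node_dim + out_dim, out_dim)

    def forward(self, x, edge_index, edge_attr):
        src, dst = edge_index                       # (2, E): segment adjacency
        m = torch.relu(self.msg(torch.cat([x[src], edge_attr], dim=-1)))
        agg = torch.zeros(x.size(0), m.size(1), device=x.device)
        agg.index_add_(0, dst, m)                   # sum messages per target node
        deg = torch.bincount(dst, minlength=x.size(0)).clamp(min=1)
        agg = agg / deg.unsqueeze(-1).float()       # mean aggregation
        return torch.relu(self.upd(torch.cat([x, agg], dim=-1)))

class SegmentClassifier(nn.Module):
    def __init__(self, node_dim=32, edge_dim=16, hidden=64, num_classes=6):
        super().__init__()
        self.conv1 = SegmentGraphConv(node_dim, edge_dim, hidden)
        self.conv2 = SegmentGraphConv(hidden, edge_dim, hidden)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, edge_attr):
        x = self.conv1(x, edge_index, edge_attr)
        x = self.conv2(x, edge_index, edge_attr)
        return self.head(x)                         # per-segment class logits

# Toy usage: 5 segments connected by 4 directed adjacency edges.
x = torch.randn(5, 32)                              # geometric + photometric node features
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
edge_attr = torch.randn(4, 16)                      # contextual edge features
logits = SegmentClassifier()(x, edge_index, edge_attr)
print(logits.shape)                                 # torch.Size([5, 6])
```

Mean aggregation is chosen here only to keep the sketch self-contained; the key design point it illustrates is that edge features enter the message function, so contextual relations between adjacent segments can influence each segment's predicted label.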