Most urban applications require building footprints as compact vector graphics with sharp boundaries rather than pixel-wise raster images. This need contrasts with the majority of existing methods, which typically generate over-smoothed footprint polygons. Editing such automatically produced polygons is often inefficient and can even be more time-consuming than manual digitization. This paper introduces a semi-automatic approach to building footprint extraction based on semantically sensitive superpixels and graph neural networks. Drawing inspiration from object-based classification techniques, we first learn to generate superpixels that are not only boundary-preserving but also semantically sensitive: they respond exclusively to building boundaries rather than to other natural objects, while simultaneously producing a semantic segmentation of the buildings. These intermediate superpixel representations can naturally be regarded as nodes of a graph. Graph neural networks are therefore employed to model the global interactions among all superpixels and to enhance the representativeness of node features for building segmentation. Classical techniques are then used to extract and regularize boundaries, yielding vectorized building footprints. With only a few clicks and simple strokes, accurate segmentation results are obtained efficiently, without the need to edit polygon vertices. Experiments on several public benchmark datasets demonstrate the accuracy and efficiency of the proposed approach: compared with established methods, we observe a 10\% improvement in the superpixel clustering metric and an 8\% gain in the vector graphics evaluation. In addition, we devise an optimized pipeline for interactive editing that further improves the overall quality of the results.
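To make the superpixel-to-graph construction concrete, the short Python sketch below shows one possible way to treat superpixels as graph nodes and apply a single normalized graph-convolution pass. It is an illustrative approximation under simple assumptions, not the implementation described in this paper; the inputs \texttt{features} (a $C \times H \times W$ tensor of pixel features), \texttt{labels} (an $H \times W$ map of superpixel ids), and \texttt{weight} (a learnable projection matrix), as well as the helper name \texttt{superpixel\_graph\_conv}, are hypothetical placeholders.

\begin{verbatim}
# Minimal sketch (not the authors' implementation): pool pixel features into
# superpixel nodes, connect spatially adjacent superpixels, and apply one
# normalized graph-convolution step. `features`, `labels`, and `weight` are
# hypothetical placeholders.
import torch

def superpixel_graph_conv(features: torch.Tensor,   # (C, H, W) pixel features
                          labels: torch.Tensor,     # (H, W) superpixel ids
                          weight: torch.Tensor      # (C, C_out) projection
                          ) -> torch.Tensor:        # (N, C_out) node features
    C, H, W = features.shape
    labels = labels.long()
    n = int(labels.max()) + 1

    # Node features: mean of the pixel features inside each superpixel.
    flat = features.reshape(C, -1).t()                     # (H*W, C)
    idx = labels.reshape(-1)                               # (H*W,)
    node_feat = torch.zeros(n, C).index_add_(0, idx, flat)
    counts = torch.bincount(idx, minlength=n).clamp(min=1).unsqueeze(1)
    node_feat = node_feat / counts

    # Adjacency: superpixels that touch horizontally or vertically,
    # plus self-loops.
    adj = torch.eye(n)
    right = torch.stack([labels[:, :-1].reshape(-1), labels[:, 1:].reshape(-1)])
    down  = torch.stack([labels[:-1, :].reshape(-1), labels[1:, :].reshape(-1)])
    pairs = torch.cat([right, down], dim=1)
    adj[pairs[0], pairs[1]] = 1.0
    adj[pairs[1], pairs[0]] = 1.0

    # Symmetrically normalized propagation: D^-1/2 A D^-1/2 X W.
    deg_inv_sqrt = adj.sum(1).pow(-0.5)
    norm_adj = deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)
    return torch.relu(norm_adj @ node_feat @ weight)
\end{verbatim}

The updated node features produced this way aggregate context from neighboring superpixels; stacking several such passes is one common way to model longer-range interactions across the whole image.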