Extracting polygonal building footprints from off-nadir imagery is crucial for diverse applications. Current deep-learning-based extraction approaches predominantly rely on semantic segmentation paradigms and post-processing algorithms, limiting their boundary precision and applicability. However, existing polygonal extraction methodologies are inherently designed for near-nadir imagery and fail under the geometric complexities introduced by off-nadir viewing angles. To address these challenges, this paper introduces Polygonal Footprint Network (PolyFootNet), a novel deep-learning framework that directly outputs polygonal building footprints without requiring external post-processing steps. PolyFootNet employs a High-Quality Mask Prompter to generate precise roof masks, which guide polygonal vertex extraction in a unified model pipeline. A key contribution of PolyFootNet is introducing the Self Offset Attention mechanism, grounded in Nadaraya-Watson regression, to effectively mitigate the accuracy discrepancy observed between low-rise and high-rise buildings. This approach allows low-rise building predictions to leverage angular corrections learned from high-rise building offsets, significantly enhancing overall extraction accuracy. Additionally, motivated by the inherent ambiguity of building footprint extraction tasks, we systematically investigate alternative extraction paradigms and demonstrate that a combined approach of building masks and offsets achieves superior polygonal footprint results. Extensive experiments validate PolyFootNet's effectiveness, illustrating its promising potential as a robust, generalizable, and precise polygonal building footprint extraction method from challenging off-nadir imagery. To facilitate further research, we will release pre-trained weights of our offset prediction module at https://github.com/likaiucas/PolyFootNet.
翻译:暂无翻译