Learning to Generate 3D Representations of Building Roofs Using Single-View Aerial Imagery

From arXiv. Copyright 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

We present a novel pipeline for learning the conditional distribution of a building roof mesh given the pixels of an aerial image, under the assumption that roof geometry follows a set of regular patterns. Unlike alternative methods that require multiple images of the same object, our approach estimates a 3D roof mesh from a single image. The approach employs PolyGen, a deep generative transformer architecture for 3D meshes. We apply this model in a new domain and investigate its sensitivity to image resolution. We propose a novel metric to evaluate the quality of the inferred meshes, and our results show that the model is robust even at lower resolutions, while qualitatively producing realistic representations for out-of-distribution samples.
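
To make "the conditional distribution of a roof mesh given an image" concrete, the following is a minimal sketch of how a PolyGen-style model is usually factorized. The notation is an assumption introduced here for illustration and is not taken from the abstract: I is the aerial image, V = (v_1, ..., v_{N_V}) is the flattened sequence of quantized vertex coordinates, and F = (f_1, ..., f_{N_F}) is the flattened sequence of vertex indices that make up the faces. Whether the face model is also conditioned on the image is likewise an assumption.

```latex
% Mesh M = (V, F); I is the conditioning image (notation assumed for illustration).
\begin{align}
  p(M \mid I) &= p(V \mid I)\, p(F \mid V, I), \\
  p(V \mid I) &= \prod_{n=1}^{N_V} p\left(v_n \mid v_{<n},\, I\right), \\
  p(F \mid V, I) &= \prod_{n=1}^{N_F} p\left(f_n \mid f_{<n},\, V,\, I\right).
\end{align}
```

Under this reading, a roof mesh for a single image is obtained by first sampling the vertex sequence token by token and then sampling the face sequence given those vertices.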
