We present a novel pipeline for learning the conditional distribution of a building roof mesh given the pixels of an aerial image, under the assumption that roof geometry follows a set of regular patterns. Unlike alternative methods that require multiple images of the same object, our approach estimates a 3D roof mesh from a single image. The approach employs PolyGen, a deep generative transformer architecture for 3D meshes. We apply this model in a new domain and investigate its sensitivity to image resolution. We propose a novel metric to evaluate the quality of the inferred meshes. Our results show that the model is robust even at lower resolutions and that it qualitatively produces realistic representations for out-of-distribution samples.