We propose an unsupervised method for parsing large 3D scans of real-world scenes into interpretable parts. Our goal is to provide a practical tool for analyzing 3D scenes with unique characteristics in the context of aerial surveying and mapping, without relying on application-specific user annotations. Our approach is based on a probabilistic reconstruction model that decomposes an input 3D point cloud into a small set of learned prototypical shapes. Our model provides an interpretable reconstruction of complex scenes and yields relevant instance and semantic segmentations. To demonstrate the usefulness of our results, we introduce a novel dataset of seven diverse aerial LiDAR scans. We show that our method outperforms state-of-the-art unsupervised methods in terms of decomposition accuracy while remaining visually interpretable. Because it requires no manual annotations, our method offers a significant advantage over existing approaches and provides a practical, efficient tool for 3D scene analysis. Our code and dataset are available at https://imagine.enpc.fr/~loiseaur/learnable-earth-parser
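As a rough illustration of the decomposition idea, the sketch below fits a small set of learnable prototype point clouds to a scene by gradient descent on a Chamfer reconstruction loss. This is a minimal sketch under simplifying assumptions, not the paper's actual probabilistic model (which also handles placements, poses, and a probabilistic choice of prototype per scene location); the names `chamfer` and `fit_prototypes` and the parameters `K` and `P` are hypothetical.

```python
# Illustrative sketch only -- NOT the authors' model. It conveys the general
# idea: learn K prototype shapes (P points each) plus per-prototype 3D
# offsets so that their union reconstructs an input point cloud.
import torch

def chamfer(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    d = torch.cdist(a, b)  # (N, M) pairwise Euclidean distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

def fit_prototypes(scene: torch.Tensor, K: int = 4, P: int = 64,
                   steps: int = 500, lr: float = 1e-2) -> torch.Tensor:
    """Fit K learnable prototypes (P points each) and their translations
    to a scene point cloud of shape (N, 3) by minimizing Chamfer loss."""
    protos = torch.randn(K, P, 3, requires_grad=True)  # prototype geometry
    offsets = scene.mean(0).repeat(K, 1).clone().requires_grad_(True)  # placements
    opt = torch.optim.Adam([protos, offsets], lr=lr)
    for _ in range(steps):
        # Union of all placed prototypes, flattened to one (K*P, 3) cloud.
        recon = (protos + offsets[:, None, :]).reshape(-1, 3)
        loss = chamfer(recon, scene)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return protos.detach()

# Toy usage on a random "scene" of 1,000 points.
scene = torch.rand(1000, 3)
prototypes = fit_prototypes(scene)
print(prototypes.shape)  # torch.Size([4, 64, 3])
```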