Semantic scene completion is the task of jointly estimating 3D geometry and semantics of objects and surfaces within a given extent. This is a particularly challenging task on real-world data that is sparse and occluded. We propose a scene segmentation network based on local Deep Implicit Functions as a novel learning-based method for scene completion. Unlike previous work on scene completion, our method produces a continuous scene representation that is not based on voxelization. We encode raw point clouds into a latent space locally and at multiple spatial resolutions. A global scene completion function is subsequently assembled from the localized function patches. We show that this continuous representation is suitable to encode geometric and semantic properties of extensive outdoor scenes without the need for spatial discretization (thus avoiding the trade-off between level of scene detail and the scene extent that can be covered). We train and evaluate our method on semantically annotated LiDAR scans from the SemanticKITTI dataset. Our experiments verify that our method generates a powerful representation that can be decoded into a dense 3D description of a given scene. The performance of our method surpasses the state of the art on the SemanticKITTI Scene Completion Benchmark in terms of geometric completion intersection-over-union (IoU).
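The core idea of a local Deep Implicit Function can be sketched in a few lines: latent codes are stored on a coarse spatial grid, and a shared decoder maps (local latent code, local offset) to occupancy and semantic logits, so the scene function stays continuous in the query coordinates. The following is a minimal illustrative sketch, not the paper's architecture; the grid resolution, latent dimension, MLP shape, and all names here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative hyperparameters (assumed, not from the paper).
GRID_RES = 4.0      # metres per latent cell on the ground plane
LATENT_DIM = 16     # size of each local latent code
NUM_CLASSES = 5     # toy number of semantic classes

# Stand-in for the encoder output: one latent code per grid cell.
latent_grid = rng.normal(size=(8, 8, LATENT_DIM))

# Toy weights for a shared two-layer MLP decoder.
W1 = rng.normal(size=(LATENT_DIM + 3, 32)) * 0.1
b1 = np.zeros(32)
W2 = rng.normal(size=(32, 1 + NUM_CLASSES)) * 0.1
b2 = np.zeros(1 + NUM_CLASSES)

def decode(query_xyz):
    """Continuous decode at a 3D point: occupancy in (0, 1) plus
    per-class semantic logits. The query is expressed relative to
    its latent cell, which keeps each function patch localized."""
    i = int(query_xyz[0] // GRID_RES)
    j = int(query_xyz[1] // GRID_RES)
    z = latent_grid[i, j]
    offset = np.array([query_xyz[0] % GRID_RES,
                       query_xyz[1] % GRID_RES,
                       query_xyz[2]])
    h = np.maximum(np.concatenate([z, offset]) @ W1 + b1, 0.0)  # ReLU
    out = h @ W2 + b2
    occupancy = 1.0 / (1.0 + np.exp(-out[0]))   # sigmoid
    sem_logits = out[1:]
    return occupancy, sem_logits

occ, logits = decode(np.array([10.3, 5.7, 0.4]))
print(occ, logits.shape)
```

Because the decoder takes continuous coordinates, the scene can be queried at arbitrary density after training, which is what avoids the voxel-resolution versus scene-extent trade-off described above.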