Compact and accurate representations of 3D shapes are central to many perception and robotics tasks. State-of-the-art learning-based methods can reconstruct single objects but scale poorly to large datasets. We present a novel recursive implicit representation to efficiently and accurately encode large datasets of complex 3D shapes by recursively traversing an implicit octree in latent space. Our implicit Recursive Octree Auto-Decoder (ROAD) learns a hierarchically structured latent space enabling state-of-the-art reconstruction results at a compression ratio above 99%. We also propose an efficient curriculum learning scheme that naturally exploits the coarse-to-fine properties of the underlying octree spatial representation. We explore the scaling law relating latent space dimension, dataset size, and reconstruction accuracy, showing that increasing the latent space dimension is enough to scale to large shape datasets. Finally, we show that our learned latent space encodes a coarse-to-fine hierarchical structure yielding reusable latents across different levels of details, and we provide qualitative evidence of generalization to novel shapes outside the training set.
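To make the core idea concrete, below is a minimal sketch (in PyTorch) of a recursive latent-octree traversal: a shared network takes a parent latent and predicts, for each of its eight octants, a child latent plus an occupancy logit, and recursion continues only into occupied cells up to a maximum depth. All module names, dimensions, thresholds, and the leaf-level decoding choice are illustrative assumptions for exposition, not the authors' actual ROAD architecture.

```python
# Illustrative sketch of a recursive latent octree (not the paper's implementation).
import torch
import torch.nn as nn

class LatentOctreeNode(nn.Module):
    """Map a parent latent to 8 child latents and their occupancy logits."""
    def __init__(self, latent_dim: int = 64, hidden: int = 256):
        super().__init__()
        self.latent_dim = latent_dim
        self.subdivide = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 8 * (latent_dim + 1)),  # per octant: latent + occupancy logit
        )

    def forward(self, parent_latent: torch.Tensor):
        out = self.subdivide(parent_latent).view(8, self.latent_dim + 1)
        child_latents, occ_logits = out[:, :-1], out[:, -1]
        return child_latents, occ_logits

def traverse(node: LatentOctreeNode, latent: torch.Tensor, depth: int, max_depth: int):
    """Recursively expand occupied octants, collecting leaf latents coarse-to-fine."""
    if depth == max_depth:
        return [latent]
    child_latents, occ_logits = node(latent)
    leaves = []
    for child, logit in zip(child_latents, occ_logits):
        if torch.sigmoid(logit) > 0.5:  # expand only cells predicted as occupied
            leaves.extend(traverse(node, child, depth + 1, max_depth))
    return leaves

# Usage (auto-decoder style): one learned root latent per shape, expanded to leaf latents
# that would then be decoded into local geometry (e.g., occupancy or SDF values).
root = torch.randn(64)   # per-shape latent, optimized jointly with the network at training time
node = LatentOctreeNode()
leaf_latents = traverse(node, root, depth=0, max_depth=3)
```

The coarse-to-fine structure of this traversal is also what makes a curriculum natural: shallow octree levels can be supervised first, with deeper levels added as training progresses.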