We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes, with ground truth geometry, materials, lighting and semantics. Our goal is to make the dataset creation process widely accessible, transforming scans into photorealistic datasets with high-quality ground truth for appearance, layout, semantic labels, spatially-varying BRDFs and complex lighting, including direct, indirect and visibility components. This enables important applications in inverse rendering, scene understanding and robotics. We show that deep networks trained on the proposed dataset achieve competitive performance for shape, material and lighting estimation on real images, enabling photorealistic augmented reality applications such as object insertion and material editing. We also show that our semantic labels may be used for segmentation and multi-task learning. Finally, we demonstrate that our framework can be integrated with physics engines to create virtual robotics environments with unique ground truth, such as friction coefficients and correspondence to real scenes. The dataset and all tools needed to create such datasets will be made publicly available.