Indoor scenes typically exhibit complex, spatially-varying appearance from global illumination, making inverse rendering a challenging ill-posed problem. This work presents an end-to-end, learning-based inverse rendering framework incorporating differentiable Monte Carlo raytracing with importance sampling. The framework takes a single image as input to jointly recover the underlying geometry, spatially-varying lighting, and photorealistic materials. Specifically, we introduce a physically-based differentiable rendering layer with screen-space ray tracing, resulting in more realistic specular reflections that match the input photo. In addition, we create a large-scale, photorealistic indoor scene dataset with significantly richer details like complex furniture and dedicated decorations. Further, we design a novel out-of-view lighting network with uncertainty-aware refinement leveraging hypernetwork-based neural radiance fields to predict lighting outside the view of the input photo. Through extensive evaluations on common benchmark datasets, we demonstrate superior inverse rendering quality of our method compared to state-of-the-art baselines, enabling various applications such as complex object insertion and material editing with high fidelity. Code and data will be made available at \url{https://jingsenzhu.github.io/invrend}.
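As a minimal illustration of the importance-sampled Monte Carlo estimation that underlies the differentiable rendering layer (this sketch is not the paper's implementation — it only shows the generic estimator on a toy irradiance integral with constant incoming radiance, where cosine-weighted sampling is the classic importance distribution):

```python
import math
import random

def uniform_sampler(rng):
    # Uniform over the hemisphere (solid angle): cos(theta) ~ U(0, 1), pdf = 1/(2*pi).
    cos_theta = rng.random()
    return cos_theta, 1.0 / (2.0 * math.pi)

def cosine_sampler(rng):
    # Cosine-weighted hemisphere sampling: cos(theta) = sqrt(u), pdf = cos(theta)/pi.
    cos_theta = math.sqrt(rng.random())
    return cos_theta, cos_theta / math.pi

def estimate_irradiance(n_samples, sampler, rng):
    # Monte Carlo estimate of the irradiance integral
    #   E = \int_{hemisphere} L_i(w) cos(theta) dw
    # for constant incoming radiance L_i = 1 (true value: pi).
    # Each sample contributes f(w) / pdf(w); importance sampling picks a pdf
    # proportional to the integrand to reduce variance.
    total = 0.0
    for _ in range(n_samples):
        cos_theta, pdf = sampler(rng)
        total += 1.0 * cos_theta / pdf
    return total / n_samples

rng = random.Random(0)
e_uniform = estimate_irradiance(10_000, uniform_sampler, rng)
e_cosine = estimate_irradiance(10_000, cosine_sampler, rng)
print(e_uniform, e_cosine, math.pi)
```

With the pdf exactly proportional to the integrand (the cosine-weighted case here), every sample contributes the same value and the estimator has zero variance; in a real renderer the radiance is unknown, so the pdf only approximates the integrand and importance sampling reduces, rather than eliminates, the noise.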