We address the challenge of recovering the underlying scene geometry and colors from a sparse set of RGBD view observations. In this work, we present a new solution that sequentially generates novel RGBD views along a camera trajectory, with the scene geometry simply being the fusion result of these views. More specifically, we maintain an intermediate surface mesh used for rendering new RGBD views, which are then completed by an inpainting network; each rendered RGBD view is subsequently back-projected as a partial surface and merged into the intermediate mesh. The use of an intermediate mesh and camera projection helps address the difficult problem of multi-view inconsistency. In practice, we implement the RGBD inpainting network as a versatile RGBD diffusion model, originally developed for 2D generative modeling, and modify its reverse diffusion process to enable our use. We evaluate our approach on the task of 3D scene synthesis from sparse RGBD inputs; extensive experiments on the ScanNet dataset demonstrate the superiority of our approach over existing methods. Project page: https://jblei.site/project-pages/rgbd-diffusion.html
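The incremental view-inpainting pipeline described above can be summarized by the following minimal sketch. It is purely illustrative: the helper names (render_rgbd, inpaint_with_rgbd_diffusion, back_project, fuse_into_mesh) are hypothetical placeholders, not the authors' actual API, and the loop only reflects the high-level steps stated in the abstract.

```python
def synthesize_scene(mesh, trajectory, diffusion_model):
    """Grow an intermediate surface mesh by sequentially inpainting rendered RGBD views.

    All helper functions below are assumed/hypothetical stand-ins for the steps
    described in the abstract; they are not part of any released implementation.
    """
    for camera_pose in trajectory:
        # Render a (possibly incomplete) RGBD view from the current intermediate mesh.
        partial_rgbd, visibility_mask = render_rgbd(mesh, camera_pose)

        # Complete the missing pixels with the RGBD diffusion model,
        # using its modified reverse diffusion process as an inpainting network.
        full_rgbd = inpaint_with_rgbd_diffusion(diffusion_model, partial_rgbd, visibility_mask)

        # Back-project the completed view into a partial surface and merge it into the mesh.
        partial_surface = back_project(full_rgbd, camera_pose)
        mesh = fuse_into_mesh(mesh, partial_surface)

    # The final scene geometry and colors are the fusion of all generated views.
    return mesh
```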