While current multi-frame restoration methods combine information from multiple input images using 2D alignment techniques, recent advances in novel view synthesis are paving the way for a new paradigm relying on volumetric scene representations. In this work, we introduce the first 3D-based multi-frame denoising method, which significantly outperforms its 2D-based counterparts at a lower computational cost. Our method extends the multiplane image (MPI) framework for novel view synthesis by introducing a learnable encoder-renderer pair that manipulates multiplane representations in feature space. The encoder fuses information across views and operates in a depth-wise manner, while the renderer fuses information across depths and operates in a view-wise manner. The two modules are trained end-to-end and learn to separate depths in an unsupervised way, giving rise to Multiplane Feature (MPF) representations. Experiments on the Spaces and Real Forward-Facing datasets, as well as on raw burst data, validate our approach for view synthesis, multi-frame denoising, and view synthesis under noisy conditions.
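The encoder/renderer factorization above can be illustrated at the level of tensor shapes. The sketch below is a minimal stand-in, not the paper's implementation: the learned fusion modules are replaced by simple means, and all shapes (`V` views, `D` depth planes, `C` feature channels) are hypothetical placeholders chosen for illustration.

```python
import numpy as np

# Hypothetical dimensions: V input views, D depth planes,
# C feature channels, H x W spatial resolution.
V, D, C, H, W = 4, 8, 16, 32, 32

# Plane-sweep feature volumes: one multiplane stack per input view.
psv = np.random.rand(V, D, C, H, W).astype(np.float32)

def encoder_depthwise(psv):
    """Fuse information across views, operating independently on each
    depth plane (a mean stands in for the learnable encoder)."""
    mpf = np.empty(psv.shape[1:], dtype=psv.dtype)
    for d in range(psv.shape[1]):            # depth-wise operation
        mpf[d] = psv[:, d].mean(axis=0)      # fuse across the V views
    return mpf                               # Multiplane Feature (MPF) stack

def renderer_viewwise(mpf, n_out=1):
    """Fuse information across depths, operating independently for each
    output view (a mean stands in for the learnable renderer)."""
    outs = []
    for _ in range(n_out):                   # view-wise operation
        outs.append(mpf.mean(axis=0))        # fuse across the D depth planes
    return np.stack(outs)                    # (n_out, C, H, W)

mpf = encoder_depthwise(psv)
out = renderer_viewwise(mpf)
print(mpf.shape, out.shape)  # (8, 16, 32, 32) (1, 16, 32, 32)
```

The point of the factorization is complexity: each module only ever mixes along one axis (views or depths), so neither has to process the full V x D volume jointly.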