We study the problem of novel view synthesis from sparse source observations of a scene composed of 3D objects. We propose a simple yet effective approach that is neither continuous nor implicit, challenging recent trends in view synthesis. Our approach explicitly encodes observations into a volumetric representation that enables amortized rendering. We demonstrate that although continuous radiance field representations have gained a lot of attention due to their expressive power, our simple approach obtains comparable or even better novel view reconstruction quality compared with state-of-the-art baselines, while increasing rendering speed by over 400x. Our model is trained in a category-agnostic manner and does not require scene-specific optimization, so it generalizes novel view synthesis to object categories not seen during training. In addition, we show that with our simple formulation, view synthesis can serve as a self-supervision signal for efficient learning of 3D geometry without explicit 3D supervision.
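To make the core idea concrete, below is a minimal sketch, not the authors' implementation, of an explicit volumetric pipeline: source views are lifted into a feature volume and a novel view is produced in a single forward pass, i.e., rendering is amortized and requires no per-scene optimization. All module names, shapes, the naive lifting step, and the orthographic rendering stand-in are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's architecture):
# an explicit voxel grid of features built from sparse source views,
# rendered to a novel view in one forward pass.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExplicitVolumeRenderer(nn.Module):
    def __init__(self, feat_dim=16, grid_size=32):
        super().__init__()
        self.grid_size = grid_size
        # 2D encoder: lifts each source image to a feature map.
        self.encoder = nn.Conv2d(3, feat_dim, kernel_size=3, padding=1)
        # 3D refinement of the aggregated feature volume.
        self.refine = nn.Conv3d(feat_dim, feat_dim, kernel_size=3, padding=1)
        # Decoder from rendered 2D features to RGB.
        self.to_rgb = nn.Conv2d(feat_dim, 3, kernel_size=1)

    def lift(self, images):
        # images: (B, V, 3, H, W) sparse source views of one scene.
        B, V, _, H, W = images.shape
        feats = self.encoder(images.flatten(0, 1))           # (B*V, C, H, W)
        # Naive lifting: tile image features along depth and average over
        # views (a stand-in for camera-aware unprojection).
        vol = feats.unsqueeze(2).expand(-1, -1, self.grid_size, -1, -1)
        vol = F.interpolate(vol, size=(self.grid_size,) * 3, mode="trilinear")
        vol = vol.view(B, V, -1, *((self.grid_size,) * 3)).mean(dim=1)
        return self.refine(vol)                               # (B, C, D, D, D)

    def render(self, volume, out_hw=(64, 64)):
        # Orthographic stand-in for rendering: average features along depth,
        # then decode to an image at the target resolution.
        rendered = volume.mean(dim=2)                         # (B, C, D, D)
        rendered = F.interpolate(rendered, size=out_hw, mode="bilinear",
                                 align_corners=False)
        return self.to_rgb(rendered)                          # (B, 3, H, W)

model = ExplicitVolumeRenderer()
src = torch.rand(1, 3, 3, 64, 64)        # one scene, three source views
novel = model.render(model.lift(src))    # (1, 3, 64, 64) novel-view estimate
```

Because the volume is produced by a feed-forward encoder rather than fitted per scene, the same trained weights can be applied to new scenes and unseen object categories, and the rendering cost is a single network pass.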