We present a method for learning a generative 3D model based on neural radiance fields, trained solely from data with only single views of each object. While generating realistic images is no longer a difficult task, producing the corresponding 3D structure such that it can be rendered from different views is non-trivial. We show that, unlike existing methods, one does not need multi-view data to achieve this goal. Specifically, we show that by reconstructing many images aligned to an approximate canonical pose with a single network conditioned on a shared latent space, we can learn a space of radiance fields that models shape and appearance for a class of objects. We demonstrate this by training models to reconstruct object categories using datasets that contain only one view of each subject, without depth or geometry information. Our experiments show that we achieve state-of-the-art results in novel view synthesis and high-quality results for monocular depth prediction.
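To make the conditioning scheme concrete, the sketch below shows one plausible reading of "a single network conditioned on a shared latent space": an auto-decoder-style radiance field in which every training image owns a learnable latent code and a shared MLP maps a 3D point plus that code to color and density. This is a minimal illustration under assumed details (the class name `ConditionedRadianceField`, latent and hidden sizes, and the plain MLP are all hypothetical), not the authors' implementation; positional encoding, view direction inputs, and the volume-rendering loss are omitted.

```python
# Minimal sketch (not the paper's code): a latent-conditioned radiance field.
# Each training image i gets its own learnable latent code z_i; a single shared
# MLP maps (point, z_i) -> (RGB, density), so the latent space captures per-object
# shape and appearance while only one view of each object is ever observed.
import torch
import torch.nn as nn

class ConditionedRadianceField(nn.Module):
    def __init__(self, num_images, latent_dim=64, hidden=128):
        super().__init__()
        # One latent code per training image (single view per object), optimized
        # jointly with the network weights (auto-decoder style).
        self.latents = nn.Embedding(num_images, latent_dim)
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB (3 channels) + volume density (1)
        )

    def forward(self, points, image_ids):
        # points: (N, 3) ray samples expressed in the approximate canonical frame
        # image_ids: (N,) index of the source image, selecting its latent code
        z = self.latents(image_ids)
        out = self.mlp(torch.cat([points, z], dim=-1))
        rgb = torch.sigmoid(out[..., :3])
        sigma = torch.relu(out[..., 3:])
        return rgb, sigma

# Usage: sample points along camera rays, render them with standard volume
# rendering, and minimize a photometric loss against the single ground-truth view.
model = ConditionedRadianceField(num_images=1000)
pts = torch.randn(4096, 3)                    # placeholder ray samples
ids = torch.randint(0, 1000, (4096,))         # which image each sample comes from
rgb, sigma = model(pts, ids)
```

Because all images are reconstructed through the same network, the shared weights act as a category-level prior, which is what allows novel-view synthesis and depth prediction despite never seeing more than one view per object.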