Neural scene representations, both continuous and discrete, have recently emerged as a powerful new paradigm for 3D scene understanding. Recent efforts have tackled unsupervised discovery of object-centric neural scene representations. However, the high cost of ray-marching, exacerbated by the fact that each object representation has to be ray-marched separately, leads to insufficiently sampled radiance fields and thus noisy renderings, poor frame rates, and high memory and time complexity during training and rendering. Here, we propose to represent objects in an object-centric, compositional scene representation as light fields. We propose a novel light field compositor module that enables reconstructing the global light field from a set of object-centric light fields. Dubbed Compositional Object Light Fields (COLF), our method enables unsupervised learning of object-centric neural scene representations, state-of-the-art reconstruction and novel view synthesis performance on standard datasets, and rendering and training speeds orders of magnitude faster than existing 3D approaches.
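To make the compositor idea concrete, the following is a minimal PyTorch sketch of one plausible design, not the paper's actual architecture: it assumes each object is represented by a small light field network that maps a ray (here, hypothetical 6-D Plücker coordinates) to a color and a compositing logit, and that the compositor forms the global light field by a per-ray softmax-weighted sum of the per-object colors. All module names, dimensions, and the softmax weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ObjectLightField(nn.Module):
    """Hypothetical per-object light field: maps a ray (6-D Plücker coordinates,
    assumed parameterization) to an RGB color and a scalar compositing logit."""
    def __init__(self, ray_dim: int = 6, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ray_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 color channels + 1 compositing logit
        )

    def forward(self, rays: torch.Tensor):
        out = self.net(rays)              # (num_rays, 4)
        return out[..., :3], out[..., 3]  # colors, logits


class LightFieldCompositor(nn.Module):
    """Assumed compositor: a per-ray softmax over object logits weights the
    per-object colors into a single global light field color for each ray."""
    def forward(self, colors: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
        # colors: (num_objects, num_rays, 3); logits: (num_objects, num_rays)
        weights = torch.softmax(logits, dim=0)               # normalize across objects
        return (weights.unsqueeze(-1) * colors).sum(dim=0)   # (num_rays, 3)


# Usage: evaluate three object slots on a batch of rays, then composite.
objects = nn.ModuleList(ObjectLightField() for _ in range(3))
compositor = LightFieldCompositor()
rays = torch.randn(1024, 6)  # stand-in ray coordinates
colors, logits = zip(*(obj(rays) for obj in objects))
global_rgb = compositor(torch.stack(colors), torch.stack(logits))
print(global_rgb.shape)      # torch.Size([1024, 3])
```

Because each object network is a single forward pass per ray rather than a ray march with many samples, this style of composition avoids the per-object sampling cost described above; the weighting scheme shown here is only one possible choice.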