Neural fields have recently enjoyed great success in representing and rendering 3D scenes. However, most state-of-the-art implicit representations model static or dynamic scenes as a whole, with minor variations. Existing work on learning disentangled world and object neural fields does not consider the problem of composing objects into different world neural fields in a lighting-aware manner. We present Lighting-Aware Neural Field (LANe) for the compositional synthesis of driving scenes in a physically consistent manner. Specifically, we learn a scene representation that disentangles the static background and transient elements into a world-NeRF and class-specific object-NeRFs, enabling the compositional synthesis of multiple objects in the scene. Furthermore, we explicitly design both the world and object models to handle lighting variation, which allows us to compose objects into scenes with spatially varying lighting. This is achieved by constructing a light field of the scene and using it in conjunction with a learned shader to modulate the appearance of the object-NeRFs. We demonstrate the performance of our model on a synthetic dataset of diverse lighting conditions rendered with the CARLA simulator, as well as a novel real-world dataset of cars collected at different times of the day. Our approach outperforms state-of-the-art compositional scene synthesis in this challenging setting, composing object-NeRFs learned in one scene into an entirely different scene while still respecting the lighting variations in the novel scene. For more results, please visit our project website https://lane-composition.github.io/.
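The core idea above — querying a scene's light field at an object's location and feeding it, together with the object-NeRF's features, through a learned shader — can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the feature dimensions, the tiny MLP shader, and all function names (`init_shader`, `shade`, etc.) are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(params, x):
    """Tiny two-layer MLP: ReLU hidden layer, sigmoid output (RGB in [0, 1])."""
    (w1, b1), (w2, b2) = params
    h = np.maximum(x @ w1 + b1, 0.0)
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))

def init_shader(feat_dim=16, light_dim=8, hidden=32):
    """Randomly initialize the learned shader (dimensions are assumptions)."""
    d_in = feat_dim + light_dim
    return [(rng.normal(0.0, 0.5, (d_in, hidden)), np.zeros(hidden)),
            (rng.normal(0.0, 0.5, (hidden, 3)), np.zeros(3))]

def shade(shader_params, obj_features, light_codes):
    """Modulate object appearance by the scene's local light-field code.

    obj_features: lighting-independent features emitted by the object-NeRF
                  at sampled 3D points, shape (N, feat_dim).
    light_codes:  light-field codes of the *world* scene queried at the same
                  points, shape (N, light_dim).
    Returns per-sample RGB, shape (N, 3).
    """
    return mlp(shader_params,
               np.concatenate([obj_features, light_codes], axis=-1))

# Usage: the same object features shaded under two different scenes'
# light fields produce different appearance, i.e. the composed object
# picks up the lighting of the scene it is placed into.
shader = init_shader()
feats = rng.normal(size=(4, 16))    # object-NeRF features at 4 samples
light_a = rng.normal(size=(4, 8))   # light-field codes from scene A
light_b = rng.normal(size=(4, 8))   # light-field codes from scene B
rgb_a = shade(shader, feats, light_a)
rgb_b = shade(shader, feats, light_b)
```

Because the shader conditions on the queried light-field code rather than baking lighting into the object representation, the same object can be re-shaded consistently when composed into a scene with different illumination.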