We present a learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs. We build on Neural Radiance Fields (NeRF), which uses the weights of a multilayer perceptron to model the density and color of a scene as a function of 3D coordinates. While NeRF works well on images of static subjects captured under controlled settings, it is incapable of modeling many ubiquitous, real-world phenomena in uncontrolled images, such as variable illumination or transient occluders. We introduce a series of extensions to NeRF to address these issues, thereby enabling accurate reconstructions from unstructured image collections taken from the internet. We apply our system, dubbed NeRF-W, to internet photo collections of famous landmarks, and demonstrate temporally consistent novel view renderings that are significantly closer to photorealism than the prior state of the art.
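The abstract's core mechanism — a multilayer perceptron whose weights encode a scene's density and color as a function of 3D position — can be sketched minimally as follows. This is an illustrative assumption of the general idea, not the paper's actual architecture: layer sizes, initialization, and the softplus/sigmoid output activations are all choices made here for the sketch (NeRF-W additionally conditions on appearance and transient embeddings, which are omitted).

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights and zero biases for a fully connected network (illustrative)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """ReLU MLP with a linear final layer."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

# A 3D coordinate in, four numbers out: (sigma, r, g, b).
params = init_mlp([3, 64, 64, 4])

def radiance_field(xyz):
    """Map 3D points to a non-negative density and an RGB color in [0, 1]."""
    out = mlp_forward(params, xyz)
    sigma = np.log1p(np.exp(out[..., 0]))      # softplus keeps density >= 0
    rgb = 1.0 / (1.0 + np.exp(-out[..., 1:]))  # sigmoid keeps color in [0, 1]
    return sigma, rgb

sigma, rgb = radiance_field(np.array([[0.1, -0.2, 0.5]]))
```

In the full method, such a field is queried at many points along each camera ray and the results are composited by volume rendering to form a pixel; the weights are then fit to the input photographs.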