In this paper, we present an efficient and robust deep learning solution for novel view synthesis of complex scenes. In our approach, a 3D scene is represented as a light field, i.e., a set of rays, each of which has a corresponding color when reaching the image plane. For efficient novel view rendering, we adopt a two-plane parameterization of the light field, where each ray is characterized by a 4D coordinate. We then formulate the light field as a 4D function that maps these 4D coordinates to corresponding color values. We train a deep fully connected network to optimize this implicit function and memorize the 3D scene. The scene-specific model is then used to synthesize novel views. Unlike previous light field approaches, which require dense view sampling to reliably render novel views, our method renders novel views by sampling rays and querying the color of each ray directly from the network, thus enabling high-quality light field rendering from a sparser set of training images. Per-ray depth can optionally be predicted by the network, enabling applications such as auto-refocus. Our novel view synthesis results are comparable to the state of the art, and even superior in some challenging scenes with refraction and reflection. We achieve this while maintaining an interactive frame rate and a small memory footprint.
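To make the core idea concrete, below is a minimal sketch of a scene-specific network that maps a two-plane ray coordinate (u, v, s, t) to a color and an optional per-ray depth. The class name, layer count, width, and activation choices are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LightFieldMLP(nn.Module):
    """Hypothetical fully connected network: 4D ray coordinate -> (RGB, depth)."""
    def __init__(self, hidden=256, depth=8):
        super().__init__()
        layers, in_dim = [], 4  # input is the two-plane parameterization (u, v, s, t)
        for _ in range(depth):
            layers += [nn.Linear(in_dim, hidden), nn.ReLU(inplace=True)]
            in_dim = hidden
        self.trunk = nn.Sequential(*layers)
        self.rgb_head = nn.Linear(hidden, 3)    # per-ray color
        self.depth_head = nn.Linear(hidden, 1)  # optional per-ray depth

    def forward(self, uvst):                    # uvst: (N, 4) batch of ray coordinates
        h = self.trunk(uvst)
        return torch.sigmoid(self.rgb_head(h)), self.depth_head(h)

# Rendering a novel view amounts to generating the (u, v, s, t) coordinate of
# every pixel's ray and querying the trained network once per ray.
model = LightFieldMLP()
rays = torch.rand(1024, 4)                      # placeholder batch of rays
colors, depths = model(rays)
```

Because each pixel is a single network query rather than an integral along a volume, this formulation supports interactive rendering with a small memory footprint, as claimed above.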