Novel view synthesis from a single image has made tremendous strides with advanced autoregressive models. Although recent methods generate high-quality novel views, synthesizing with only a single explicit or implicit 3D geometry involves a trade-off between two objectives, which we call the ``seesaw'' problem: 1) preserving reprojected contents and 2) completing realistic out-of-view regions. Moreover, autoregressive models incur considerable computational cost. In this paper, we propose a single-image view synthesis framework that mitigates the seesaw problem. The proposed model is an efficient non-autoregressive model with both implicit and explicit renderers. Motivated by the observation that explicit methods preserve reprojected pixels well while implicit methods complete realistic out-of-view regions, we introduce a loss function that lets the two renderers complement each other. Our loss function encourages explicit features to improve the reprojected regions of implicit features, and implicit features to improve the out-of-view regions of explicit features. With the proposed architecture and loss function, we alleviate the seesaw problem, outperforming autoregressive state-of-the-art methods while generating an image $\approx$100 times faster. We validate the efficiency and effectiveness of our method with experiments on the RealEstate10K and ACID datasets.
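The abstract describes the complementary loss only at a high level. The minimal PyTorch sketch below illustrates one way such a cross-renderer objective could be written; the function and variable names (complementary_loss, feat_explicit, feat_implicit, vis_mask) and the choice of an L1 distance with stop-gradients are assumptions for illustration, not the paper's actual formulation.

```python
import torch.nn.functional as F

def complementary_loss(feat_explicit, feat_implicit, vis_mask):
    """Conceptual sketch of a complementary loss between two renderers.

    feat_explicit, feat_implicit: (B, C, H, W) feature maps from the
        explicit and implicit renderers (hypothetical names).
    vis_mask: (B, 1, H, W) binary mask, 1 where source pixels reproject
        into the target view, 0 in out-of-view regions.
    """
    oov_mask = 1.0 - vis_mask
    # Explicit features guide implicit features in the reprojected area
    # (stop-gradient on the explicit branch).
    loss_reproj = F.l1_loss(feat_implicit * vis_mask,
                            (feat_explicit * vis_mask).detach())
    # Implicit features guide explicit features in the out-of-view area
    # (stop-gradient on the implicit branch).
    loss_oov = F.l1_loss(feat_explicit * oov_mask,
                         (feat_implicit * oov_mask).detach())
    return loss_reproj + loss_oov
```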