Creating novel views from a single image has made tremendous strides with advanced autoregressive models, since unseen regions must be inferred from the visible scene content. Although recent methods generate high-quality novel views, synthesizing with only a single explicit or implicit 3D geometry entails a trade-off between two objectives that we call the "seesaw" problem: 1) preserving reprojected contents and 2) completing realistic out-of-view regions. In addition, autoregressive models require considerable computational cost. In this paper, we propose a single-image view synthesis framework that mitigates the seesaw problem while using an efficient non-autoregressive model. Motivated by the observation that explicit methods preserve reprojected pixels well and implicit methods complete realistic out-of-view regions, we introduce a loss function that makes the two renderers complement each other. Our loss function encourages explicit features to improve the reprojected regions of implicit features, and implicit features to improve the out-of-view regions of explicit features. With the proposed architecture and loss function, we alleviate the seesaw problem, outperforming autoregressive-based state-of-the-art methods while generating an image $\approx$100 times faster. We validate the efficiency and effectiveness of our method with experiments on the RealEstate10K and ACID datasets.
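As a hedged illustration of the complementary objective described above (the symbols below are assumptions for exposition, not the paper's exact notation), such a loss can be sketched with a reprojection-visibility mask $M$, explicit and implicit feature maps $F_{\text{exp}}$ and $F_{\text{imp}}$, and a stop-gradient operator $\mathrm{sg}(\cdot)$:
$$
\mathcal{L}_{\text{comp}} = \big\| M \odot \big(F_{\text{imp}} - \mathrm{sg}(F_{\text{exp}})\big) \big\|_1 + \big\| (1 - M) \odot \big(F_{\text{exp}} - \mathrm{sg}(F_{\text{imp}})\big) \big\|_1 ,
$$
so that the implicit features are guided by the explicit features inside the reprojected (visible) region, while the explicit features are guided by the implicit features in the out-of-view region.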