Synthesizing novel views from a single-view image is a highly ill-posed problem. We find that the learning ambiguity can be reduced effectively by expanding the single-view view synthesis problem to a multi-view setting. Specifically, we leverage the reliable and explicit stereo prior to generate a pseudo-stereo viewpoint, which serves as an auxiliary input for constructing the 3D space. In this way, the challenging novel view synthesis process is decoupled into two simpler problems: stereo synthesis and 3D reconstruction. To synthesize a structurally correct and detail-preserving stereo image, we propose a self-rectified stereo synthesis that amends erroneous regions in an identify-rectify manner. Hard-to-train and incorrectly warped samples are first identified by two strategies: 1) pruning the network to reveal low-confidence predictions; and 2) bidirectionally matching between the stereo images to discover improper mappings. These regions are then inpainted to form the final pseudo-stereo image. With the aid of this extra input, a better 3D reconstruction can be obtained more easily, and our method can work with arbitrary 3D representations. Extensive experiments show that our method outperforms state-of-the-art single-view view synthesis methods and stereo synthesis methods.
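The bidirectional matching strategy can be illustrated with a conventional left-right consistency check. The abstract does not specify how the matching is actually carried out, so the following is only a minimal sketch under stated assumptions: the two views are rectified, each view has a horizontal disparity map, and a left pixel at column x corresponds to column x - d in the right view. The function name `inconsistent_mask` and the tolerance `tol` are hypothetical and chosen for illustration.

```python
import numpy as np

def inconsistent_mask(disp_left, disp_right, tol=1.0):
    """Flag left-view pixels whose left-to-right and right-to-left
    disparities disagree, or whose match falls outside the image.

    Assumption (not from the source): rectified stereo, so a left pixel
    at column x matches column x - disp_left in the right view.
    """
    h, w = disp_left.shape
    xs = np.tile(np.arange(w), (h, 1))           # column index of each pixel
    ys = np.tile(np.arange(h)[:, None], (1, w))  # row index of each pixel

    # Column in the right view that each left pixel maps to.
    x_right = np.rint(xs - disp_left).astype(int)
    in_bounds = (x_right >= 0) & (x_right < w)
    x_clipped = np.clip(x_right, 0, w - 1)

    # Disparity stored at the matched right-view pixel.
    disp_back = disp_right[ys, x_clipped]

    # A mapping is treated as improper if the two disparities differ by
    # more than tol, or if the match leaves the image.
    mismatch = np.abs(disp_left - disp_back) > tol
    return mismatch | ~in_bounds

# Usage: a constant-disparity scene with one corrupted left-view pixel.
disp_l = np.full((4, 8), 2.0)
disp_r = np.full((4, 8), 2.0)
disp_l[1, 5] = 5.0
mask = inconsistent_mask(disp_l, disp_r)
```

In this toy input, the mask flags the corrupted pixel and the two leftmost columns, whose matches fall outside the right view. In a pipeline like the one described, such flagged regions would be the candidates handed to the inpainting step.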