We introduce FaDIV-Syn, a fast depth-independent method for novel view synthesis. Related methods are often limited by their depth estimation stage, where incorrect depth predictions can lead to large projection errors. To avoid this issue, we efficiently warp the input images into the target frame for a range of assumed depth planes. The resulting plane sweep volume (PSV) is fed directly into our network, which first estimates soft PSV masks in a self-supervised manner and then directly produces the novel output view, side-stepping explicit depth estimation altogether. This improves efficiency and performance on transparent, reflective, thin, and featureless scene parts. FaDIV-Syn handles both interpolation and extrapolation tasks and outperforms state-of-the-art extrapolation methods on the large-scale RealEstate10k dataset. In contrast to comparable methods, it achieves real-time performance thanks to its lightweight architecture. We thoroughly evaluate ablations such as removing the Soft-Masking network, training from fewer examples, and generalization to higher resolutions and stronger depth discretization.
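The plane sweep volume described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes fronto-parallel depth planes, known camera intrinsics `K` and relative poses `(R, t)`, and uses the standard plane-induced homography H = K (R - t nᵀ / d) K⁻¹ with nearest-neighbour inverse warping; all function names are hypothetical.

```python
import numpy as np

def plane_homography(K, R, t, depth):
    """Homography induced by a fronto-parallel plane at the given depth.

    Uses the standard plane-sweep form H = K (R - t n^T / d) K^-1
    with plane normal n = (0, 0, 1) (an assumption for this sketch).
    """
    n = np.array([0.0, 0.0, 1.0])
    return K @ (R - np.outer(t, n) / depth) @ np.linalg.inv(K)

def warp_to_plane(src, K, R, t, depth):
    """Inverse-warp `src` into the target view for one assumed depth plane
    (nearest-neighbour sampling; out-of-bounds pixels are set to 0)."""
    h, w = src.shape[:2]
    H = plane_homography(K, R, t, depth)
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    mapped = H @ pix
    mapped = mapped[:2] / mapped[2:]               # dehomogenize
    u = np.round(mapped[0]).astype(int).reshape(h, w)
    v = np.round(mapped[1]).astype(int).reshape(h, w)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    out = np.zeros_like(src)
    out[valid] = src[v[valid], u[valid]]
    return out

def build_psv(src_images, K, poses, depths):
    """Stack warps of every source image over every assumed depth plane.

    Returns an array of shape (num_images, num_depths, H, W, C); such a
    stack is what a PSV-based network consumes as input.
    """
    return np.stack([
        np.stack([warp_to_plane(img, K, R, t, d) for d in depths])
        for img, (R, t) in zip(src_images, poses)
    ])
```

With an identity relative pose the homography reduces to the identity, so each plane's warp returns the source image unchanged; in practice the network receives the full stack and learns which depth plane explains each pixel.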