Existing image-based rendering methods usually adopt a depth-based image warping operation to synthesize novel views. In this paper, we argue that the essential limitations of the traditional warping operation are its limited neighborhood and its purely distance-based interpolation weights. To overcome them, we propose content-aware warping, which adaptively learns the interpolation weights for pixels of a relatively large neighborhood from their contextual information via a lightweight neural network. Building on this learnable warping module, we propose a new end-to-end learning-based framework for novel view synthesis from a set of input source views, in which two additional modules, namely confidence-based blending and feature-assistant spatial refinement, naturally arise to handle occlusions and to capture the spatial correlation among pixels of the synthesized view, respectively. In addition, we propose a weight-smoothness loss term to regularize the network. Experimental results on light field datasets with wide baselines and on multi-view datasets show that the proposed method significantly outperforms state-of-the-art methods, both quantitatively and visually. The source code will be publicly available at https://github.com/MantangGuo/CW4VS.
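For concreteness, the following is a minimal PyTorch sketch of what such a content-aware warping layer could look like: a lightweight network predicts one interpolation weight per pixel of a KxK source-view neighborhood from the neighborhood's content and the sub-pixel offset, rather than using fixed bilinear (distance-based) weights. All names (`ContentAwareWarping`, `weight_smoothness_loss`), the round-to-nearest neighborhood gathering, and the total-variation form of the smoothness term are illustrative assumptions, not the authors' implementation; see the linked repository for that.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContentAwareWarping(nn.Module):
    """Warp source-view features to the target view. Instead of fixed
    distance-based (bilinear) weights over a 2x2 neighborhood, a
    lightweight network predicts one weight per pixel of a larger KxK
    neighborhood from that neighborhood's content (a sketch, not the
    paper's exact architecture)."""

    def __init__(self, channels: int, k: int = 5, hidden: int = 32):
        super().__init__()
        assert k % 2 == 1, "odd neighborhood keeps the window centered"
        self.k = k
        # The "lightweight neural network" producing interpolation weights.
        self.weight_net = nn.Sequential(
            nn.Conv2d(channels * k * k + 2, hidden, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, k * k, 1),
        )

    @staticmethod
    def _sample(feat, coords):
        """Nearest sampling of feat (B,C,H,W) at pixel coords (B,2,H,W)."""
        _, _, h, w = feat.shape
        x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0  # normalize to [-1, 1]
        y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
        grid = torch.stack((x, y), dim=-1)                   # (B,H,W,2)
        return F.grid_sample(feat, grid, mode="nearest", align_corners=True)

    def forward(self, src_feat, corr):
        """src_feat: (B,C,H,W) source features; corr: (B,2,H,W) depth-induced
        (x, y) correspondence of each target pixel in the source view."""
        center = corr.round()
        frac = corr - center                                 # sub-pixel offset
        r = self.k // 2
        neigh = []                                           # gather KxK window
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                off = torch.tensor([dx, dy], dtype=corr.dtype,
                                   device=corr.device).view(1, 2, 1, 1)
                neigh.append(self._sample(src_feat, center + off))
        neigh = torch.stack(neigh, dim=1)                    # (B,K*K,C,H,W)

        # Content-aware interpolation weights, normalized per target pixel.
        net_in = torch.cat([neigh.flatten(1, 2), frac], dim=1)
        weights = F.softmax(self.weight_net(net_in), dim=1)  # (B,K*K,H,W)
        warped = (weights.unsqueeze(2) * neigh).sum(dim=1)   # (B,C,H,W)
        return warped, weights                               # weights for the loss


def weight_smoothness_loss(weights):
    """The abstract's weight-smoothness regularizer, sketched here as a
    total-variation penalty on the predicted weight maps (an assumption
    about its exact form)."""
    dx = (weights[..., :, 1:] - weights[..., :, :-1]).abs().mean()
    dy = (weights[..., 1:, :] - weights[..., :-1, :]).abs().mean()
    return dx + dy
```

In this reading, the softmax-normalized weights replace the fixed bilinear kernel, so occluded or texture-inconsistent neighbors can be down-weighted based on content; the confidence-based blending and feature-assistant spatial refinement modules described in the abstract would operate downstream of this layer and are not sketched here.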