Novel-view synthesis (NVS) can be tackled through different approaches, depending on the general setting: from a single source image to a short video sequence, with exact or noisy camera pose information, with 3D-based information such as point clouds, etc. The most challenging scenario, the one we address in this work, considers only a single source image from which to generate a novel image seen from another viewpoint. In this difficult setting, however, the latest learning-based solutions often struggle to integrate the camera viewpoint transformation: the extrinsic information is typically passed as-is, through a low-dimensional vector. The camera pose, when parametrized as Euler angles, may even be quantized into a one-hot representation. Such a vanilla encoding prevents the learnt architecture from inferring novel views on a continuous basis (from a camera-pose perspective). We claim there exists a more elegant way to encode the relative camera pose, by leveraging 3D-related concepts such as the epipolar constraint. We therefore introduce a novel method that encodes the viewpoint transformation as a 2D feature image. This camera-encoding strategy gives the network meaningful insight into how the camera has moved in space between the two views. By encoding the camera pose information as a finite number of coloured epipolar lines, we demonstrate through our experiments that our strategy outperforms vanilla encoding.
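As a minimal sketch of the idea, the relative pose can be turned into a 2D feature image by building the fundamental matrix from the extrinsics and rasterising a few coloured epipolar lines. The function name, the sampling of source pixels, and the colour palette below are our illustrative assumptions, not the paper's exact procedure; shared intrinsics across both views are also assumed.

```python
import numpy as np

def skew(v):
    """Skew-symmetric cross-product matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def epipolar_pose_image(K, R, t, H, W, n_lines=8):
    """Rasterise n_lines coloured epipolar lines into an (H, W, 3) image.

    K: (3, 3) shared intrinsics; R, t: relative rotation/translation from
    the source to the target view. Sampling scheme and palette are
    illustrative choices, not the method described in the paper.
    """
    # Fundamental matrix mapping a source pixel x to a target line l = F x.
    K_inv = np.linalg.inv(K)
    F = K_inv.T @ skew(t) @ R @ K_inv
    # Sample source pixels along a horizontal strip through the image centre.
    us = np.linspace(0.1 * W, 0.9 * W, n_lines)
    img = np.zeros((H, W, 3), dtype=np.float32)
    vv, uu = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    for i, u in enumerate(us):
        a, b, c = F @ np.array([u, H / 2.0, 1.0])   # line a*u + b*v + c = 0
        norm = np.hypot(a, b)
        if norm < 1e-8:                             # degenerate line, skip it
            continue
        dist = np.abs(a * uu + b * vv + c) / norm   # pixel-to-line distance
        mask = dist < 1.0                           # ~1-pixel-thick stroke
        hue = i / max(n_lines - 1, 1)
        img[mask] = (hue, 1.0 - hue, 0.5)           # distinct colour per line
    return img
```

The resulting image can be concatenated to the source image channels, so the network receives the viewpoint change spatially rather than as an opaque low-dimensional vector.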