View synthesis aims to generate novel views from one or more given source views. Although existing methods have achieved promising performance, they usually require paired views of different poses to learn a pixel transformation. This paper proposes an unsupervised network that learns such a pixel transformation from a single source viewpoint. In particular, the network consists of a token transformation module (TTM), which transforms the features extracted from a source viewpoint image into an intrinsic representation with respect to a pre-defined reference pose, and a view generation module (VGM), which synthesizes an arbitrary view from that representation. The learned transformation allows us to synthesize a novel view from any single source viewpoint image of unknown pose. Experiments on widely used view synthesis datasets demonstrate that the proposed network produces results comparable to state-of-the-art methods, even though learning is unsupervised and only a single source viewpoint image is required to generate a novel view. The code will be available soon.
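The two-stage pipeline described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, feature dimensions, and linear maps are all assumptions made for exposition, standing in for the actual TTM and VGM networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper).
d_feat, d_repr, d_pose, d_pixels = 64, 32, 4, 16 * 16

# Stand-in parameters for the two modules.
W_ttm = rng.standard_normal((d_feat, d_repr)) * 0.1
W_vgm = rng.standard_normal((d_repr + d_pose, d_pixels)) * 0.1

def token_transformation_module(source_features):
    """TTM sketch: map features of a source view with unknown pose to an
    intrinsic representation aligned with a pre-defined reference pose."""
    return np.tanh(source_features @ W_ttm)

def view_generation_module(intrinsic_repr, target_pose):
    """VGM sketch: synthesize a novel view from the intrinsic
    representation conditioned on an arbitrary target pose."""
    conditioned = np.concatenate([intrinsic_repr, target_pose])
    return conditioned @ W_vgm

# A single source viewpoint image (here, its extracted feature vector)
# suffices to render any target pose.
source_features = rng.standard_normal(d_feat)
target_pose = rng.standard_normal(d_pose)

intrinsic = token_transformation_module(source_features)
novel_view = view_generation_module(intrinsic, target_pose)
print(novel_view.shape)  # one flattened 16x16 view: (256,)
```

The key property the sketch mirrors is that the source pose never appears as an input: the TTM normalizes away the unknown source pose, so only the desired target pose must be supplied at generation time.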