Existing neural human rendering methods struggle with single-image input because invisible areas carry no information and the depth of pixels in visible areas is ambiguous. To address this, we propose Monocular Neural Human Renderer (MonoNHR), a novel approach that renders robust free-viewpoint images of an arbitrary human given only a single image. MonoNHR is the first method that (i) renders human subjects never seen during training in a monocular setup, and (ii) is trained in a weakly supervised manner without geometry supervision. It builds on two ideas. First, we disentangle 3D geometry and texture features and condition texture inference on the 3D geometry features. Second, we introduce a Mesh Inpainter module that inpaints occluded parts by exploiting human structural priors such as symmetry. Experiments on the ZJU-MoCap, AIST, and HUMBI datasets show that our approach significantly outperforms recent methods adapted to the monocular case.
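To make the first idea concrete, the following is a minimal NumPy sketch of what conditioning texture inference on geometry features could look like for a batch of 3D query points. It is an illustration under stated assumptions, not the MonoNHR architecture: the feature widths, the layer sizes, the random untrained weights, and the names `geo_branch`, `tex_branch`, and `query` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init(sizes):
    """Random (untrained) weights for an MLP with the given layer widths."""
    return [(rng.normal(0.0, 0.1, (i, o)), np.zeros(o))
            for i, o in zip(sizes[:-1], sizes[1:])]

def mlp(x, weights):
    """Tiny MLP: ReLU on hidden layers, linear last layer."""
    for w, b in weights[:-1]:
        x = np.maximum(x @ w + b, 0.0)
    w, b = weights[-1]
    return x @ w + b

# Hypothetical feature widths, chosen only for illustration.
FEAT, GEO, TEX = 16, 8, 8

geo_branch = init([FEAT, 32, GEO])        # per-point image feature -> geometry feature
density_head = init([GEO, 1])             # geometry feature -> density
tex_branch = init([FEAT + GEO, 32, TEX])  # image feature + geometry feature -> texture feature
rgb_head = init([TEX, 3])                 # texture feature -> color

def query(point_feats):
    """Return (density, rgb) for per-point features of shape (N, FEAT)."""
    geo = mlp(point_feats, geo_branch)
    # Density is read from the geometry feature only.
    sigma = np.maximum(mlp(geo, density_head), 0.0)
    # Texture inference sees the geometry feature as an extra input.
    tex_in = np.concatenate([point_feats, geo], axis=-1)
    tex = mlp(tex_in, tex_branch)
    rgb = 1.0 / (1.0 + np.exp(-mlp(tex, rgb_head)))  # sigmoid keeps colors in (0, 1)
    return sigma, rgb

sigma, rgb = query(rng.normal(size=(5, FEAT)))
print(sigma.shape, rgb.shape)
```

In this sketch the density depends only on the geometry branch, while the color branch receives the geometry feature alongside the image feature, so the two feature types stay separate and texture is still informed by geometry.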