We propose RANA, a relightable and articulated neural avatar for the photorealistic synthesis of humans under arbitrary viewpoints, body poses, and lighting. We require only a short video clip of the person to create the avatar and assume no knowledge of the lighting environment. We present a novel framework that models humans while disentangling their geometry, texture, and lighting environment from monocular RGB videos. To simplify this otherwise ill-posed task, we first estimate the coarse geometry and texture of the person via SMPL+D model fitting and then learn an articulated neural representation for photorealistic image generation. RANA first generates the normal and albedo maps of the person in any given target body pose and then uses spherical harmonics lighting to produce the shaded image in the target lighting environment. We also propose to pretrain RANA using synthetic images and demonstrate that this leads to better disentanglement of geometry and texture while also improving robustness to novel body poses. Finally, we present a new photorealistic synthetic dataset, Relighting Humans, to quantitatively evaluate the performance of the proposed approach.
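The shading step described above, in which the albedo map is modulated by a spherical-harmonics irradiance evaluated at the per-pixel normals, can be sketched in a few lines. The following is a minimal illustration assuming a second-order (9-coefficient) real SH basis with separate RGB lighting coefficients; the function names sh_basis and sh_shade, the basis ordering, and the clipping to [0, 1] are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sh_basis(normals):
    """Evaluate the 9 real SH basis functions at each unit normal.

    normals: (H, W, 3) array of unit-length surface normals (x, y, z).
    returns: (H, W, 9) array of basis values.
    """
    x, y, z = normals[..., 0], normals[..., 1], normals[..., 2]
    return np.stack([
        0.282095 * np.ones_like(x),        # Y_0,0
        0.488603 * y,                      # Y_1,-1
        0.488603 * z,                      # Y_1,0
        0.488603 * x,                      # Y_1,1
        1.092548 * x * y,                  # Y_2,-2
        1.092548 * y * z,                  # Y_2,-1
        0.315392 * (3.0 * z ** 2 - 1.0),   # Y_2,0
        1.092548 * x * z,                  # Y_2,1
        0.546274 * (x ** 2 - y ** 2),      # Y_2,2
    ], axis=-1)

def sh_shade(albedo, normals, sh_coeffs):
    """Shade an albedo map under SH lighting: I = albedo * (basis @ coeffs).

    albedo:    (H, W, 3) RGB reflectance in [0, 1].
    normals:   (H, W, 3) unit surface normals.
    sh_coeffs: (9, 3) SH lighting coefficients, one column per RGB channel.
    returns:   (H, W, 3) shaded image, clipped to [0, 1].
    """
    shading = sh_basis(normals) @ sh_coeffs  # (H, W, 3) diffuse shading
    return np.clip(albedo * shading, 0.0, 1.0)
```

In this sketch the diffuse convolution weights (the attenuation factors of Ramamoorthi and Hanrahan's irradiance formulation) are assumed to be pre-baked into sh_coeffs; a full implementation might keep them explicit and fit the lighting coefficients jointly with the avatar.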