We propose Neural Actor (NA), a new method for high-quality synthesis of humans from arbitrary viewpoints and under arbitrary controllable poses. Our method builds on recent neural scene representation and rendering works, which learn representations of geometry and appearance from only 2D images. While existing works have demonstrated compelling rendering of static scenes and playback of dynamic scenes, photo-realistic reconstruction and rendering of humans with neural implicit methods, in particular under user-controlled novel poses, remains difficult. To address this problem, we utilize a coarse body model as a proxy to unwarp the surrounding 3D space into a canonical pose. A neural radiance field learns pose-dependent geometric deformations and pose- and view-dependent appearance effects in the canonical space from multi-view video input. To synthesize novel views with high-fidelity dynamic geometry and appearance, we leverage 2D texture maps defined on the body model as latent variables for predicting residual deformations and dynamic appearance. Experiments demonstrate that our method achieves better quality than the state of the art on playback as well as novel pose synthesis, and can even generalize well to new poses that starkly differ from the training poses. Furthermore, our method also supports body shape control of the synthesized results.
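The pipeline described above can be sketched in two stages: unwarp a query point from posed space into the canonical pose using the body-model proxy, then evaluate a radiance field conditioned on position, view direction, and a texture-map latent. The following is a minimal illustrative sketch, not the paper's implementation: it uses inverse linear blend skinning as a stand-in for the proxy-based unwarping and a single linear layer as a stand-in for the canonical-space NeRF; all function and parameter names are assumptions for illustration.

```python
import numpy as np

def inverse_skinning(x_posed, bone_transforms, skinning_weights):
    """Unwarp a posed-space point into the canonical pose.

    Simplified stand-in for the proxy-based unwarping: blend per-bone
    4x4 transforms with the skinning weights, then apply the inverse
    of the blended transform to the point.
    """
    # Weighted sum of per-bone transforms: (num_bones,) x (num_bones, 4, 4) -> (4, 4)
    T = np.tensordot(skinning_weights, bone_transforms, axes=1)
    x_h = np.append(x_posed, 1.0)  # homogeneous coordinates
    return (np.linalg.inv(T) @ x_h)[:3]

def radiance_field(x_canonical, view_dir, tex_latent, params):
    """Toy stand-in for the canonical-space radiance field.

    Maps (canonical position, view direction, texture-map latent) to
    (rgb, sigma) with one linear layer; the real model is an MLP that
    also predicts residual deformations.
    """
    feat = np.concatenate([x_canonical, view_dir, tex_latent])
    out = params["W"] @ feat + params["b"]
    rgb = 1.0 / (1.0 + np.exp(-out[:3]))  # sigmoid keeps colors in [0, 1]
    sigma = np.log1p(np.exp(out[3]))      # softplus keeps density non-negative
    return rgb, sigma

# Example query: two bones at rest (identity transforms), so the
# canonical point equals the posed point.
rng = np.random.default_rng(0)
bones = np.stack([np.eye(4), np.eye(4)])
weights = np.array([0.7, 0.3])
x = np.array([0.1, 0.2, 0.3])
x_can = inverse_skinning(x, bones, weights)

latent_dim = 8
params = {"W": rng.standard_normal((4, 3 + 3 + latent_dim)),
          "b": np.zeros(4)}
rgb, sigma = radiance_field(x_can, np.array([0.0, 0.0, 1.0]),
                            rng.standard_normal(latent_dim), params)
```

In the actual method, the texture-map latent is defined on the body model's surface and drives both residual deformation and dynamic appearance; here it is just a fixed-size vector to show the conditioning pattern.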