We present PHORHUM, a novel, end-to-end trainable, deep neural network methodology for photorealistic 3D human reconstruction given just a monocular RGB image. Our pixel-aligned method estimates detailed 3D geometry and, for the first time, the unshaded surface color together with the scene illumination. Observing that 3D supervision alone is not sufficient for high-fidelity color reconstruction, we introduce patch-based rendering losses that enable reliable color reconstruction on visible parts of the human, and detailed and plausible color estimation for the non-visible parts. Moreover, our method specifically addresses methodological and practical limitations of prior work in terms of representing geometry, albedo, and illumination effects, in an end-to-end model where these factors can be effectively disentangled. In extensive experiments, we demonstrate the versatility and robustness of our approach. Our state-of-the-art results validate the method qualitatively and quantitatively across different metrics, for both geometric and color reconstruction.
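To give a concrete intuition for the patch-based rendering losses mentioned above, the following is a minimal, hypothetical sketch: corresponding patches are sampled at random locations from a rendered image and its ground-truth counterpart, and a reconstruction loss is computed per patch. The function names (`sample_patches`, `patch_rendering_loss`), the patch size, and the use of a plain mean absolute error are illustrative assumptions, not the paper's exact formulation, which may additionally include perceptual terms.

```python
import numpy as np

def sample_patches(image, centers, size):
    # Extract square patches of side `size` centered at (y, x) coordinates.
    half = size // 2
    return np.stack([image[y - half:y + half, x - half:x + half]
                     for y, x in centers])

def patch_rendering_loss(rendered, target, rng, num_patches=4, size=8):
    # Hypothetical patch-based reconstruction loss: mean absolute error
    # over randomly sampled, spatially corresponding patches.
    h, w = rendered.shape[:2]
    half = size // 2
    ys = rng.integers(half, h - half, num_patches)
    xs = rng.integers(half, w - half, num_patches)
    centers = list(zip(ys, xs))
    patches_rendered = sample_patches(rendered, centers, size)
    patches_target = sample_patches(target, centers, size)
    return float(np.mean(np.abs(patches_rendered - patches_target)))

# Usage: identical images yield zero loss; differing images yield a
# positive loss that supervises color on sampled patches only.
rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
loss_same = patch_rendering_loss(img, img, np.random.default_rng(1))
loss_diff = patch_rendering_loss(img, 1.0 - img, np.random.default_rng(1))
```

Sampling patches rather than comparing full images keeps the loss cheap per training step while still providing dense, local photometric supervision, which is the motivation the abstract gives for going beyond 3D supervision alone.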