Recent neural human representations can produce high-quality multi-view rendering, but they require dense multi-view inputs and costly training. They are hence largely limited to static models, as training on each frame is infeasible. We present HumanNeRF, a generalizable neural representation for high-fidelity free-view synthesis of dynamic humans. Analogous to how IBRNet assists NeRF by avoiding per-scene training, HumanNeRF employs an aggregated pixel-aligned feature across multi-view inputs along with a pose-embedded non-rigid deformation field for tackling dynamic motions. The raw HumanNeRF can already produce reasonable renderings from sparse video inputs of unseen subjects and camera settings. To further improve rendering quality, we augment our solution with an appearance blending module that combines the benefits of neural volumetric rendering and neural texture blending. Extensive experiments on various multi-view dynamic human datasets demonstrate the generalizability and effectiveness of our approach in synthesizing photo-realistic free-view humans under challenging motions and with very sparse camera view inputs.
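As a rough illustration of what aggregating pixel-aligned features across views can look like, the sketch below projects a 3D query point into each input camera, looks up the image feature at the projected pixel, and pools the per-view features. It is not the authors' implementation: the pinhole camera layout, the nearest-neighbour lookup, the mean-and-variance pooling, and the names `project`, `sample`, and `aggregate` are all assumptions made for the example.

```python
from statistics import fmean, pvariance


def project(point, cam):
    """Project a world-space 3D point into a pinhole camera (hypothetical layout).

    cam holds a 3x3 rotation "R", a translation "t", and intrinsics fx, fy, cx, cy.
    Returns (u, v) pixel coordinates, or None if the point is behind the camera.
    """
    R, t = cam["R"], cam["t"]
    x, y, z = (sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))
    if z <= 0:
        return None
    return cam["fx"] * x / z + cam["cx"], cam["fy"] * y / z + cam["cy"]


def sample(feature_map, uv):
    """Nearest-neighbour lookup; feature_map[row][col] is a list of channel values."""
    u, v = uv
    col, row = int(round(u)), int(round(v))
    if 0 <= row < len(feature_map) and 0 <= col < len(feature_map[0]):
        return feature_map[row][col]
    return None  # the point projects outside this view


def aggregate(point, cams, feature_maps):
    """Pool the pixel-aligned features of one 3D point over all views that see it.

    Returns per-channel means followed by per-channel variances (an assumed
    pooling choice), or None if no view observes the point.
    """
    feats = []
    for cam, fmap in zip(cams, feature_maps):
        uv = project(point, cam)
        f = sample(fmap, uv) if uv is not None else None
        if f is not None:
            feats.append(f)
    if not feats:
        return None
    channels = list(zip(*feats))  # regroup values channel by channel
    return [fmean(c) for c in channels] + [pvariance(c) for c in channels]
```

Because the pooled vector has a fixed length regardless of how many views observe the point, a downstream network can consume it for any number of input cameras, which is one reason this style of aggregation suits sparse and varying camera setups.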