We propose a robust method for learning neural implicit functions that reconstruct 3D human heads with high-fidelity geometry from low-view inputs. We represent a 3D human head as the zero level set of a composed signed distance field consisting of a smooth template, a non-rigid deformation, and a high-frequency displacement field. The template, which is trained on multiple individuals together with the deformation network, represents identity-independent and expression-neutral features. The displacement field encodes identity-dependent geometric details and is trained separately for each individual. We train our network in two stages using a coarse-to-fine strategy, without 3D supervision. Our experiments demonstrate that the geometry decomposition and two-stage training make our method robust, and that our model outperforms existing methods in reconstruction accuracy and novel view synthesis under low-view settings. Additionally, the pre-trained template serves as a good initialization when adapting our model to unseen individuals.
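To make the geometry decomposition concrete, the following is a minimal sketch of how a template, a deformation, and a displacement field could be composed into a single signed distance function. It is an illustration under stated assumptions, not the authors' implementation: the composition rule `template(x + deformation(x)) + displacement(x)` is assumed, the three neural networks are replaced by hypothetical analytic stand-ins (a sphere, a small smooth warp, and a small high-frequency term), and all function and parameter names are invented for this example.

```python
import numpy as np


def template_sdf(x, radius=1.0):
    # Stand-in for the smooth shared template: SDF of a sphere (assumption).
    return np.linalg.norm(x, axis=-1) - radius


def deformation(x, scale=0.05):
    # Stand-in for the non-rigid deformation network: a small smooth offset
    # added to each query point (hypothetical analytic form).
    return scale * np.sin(x)


def displacement(x, amp=0.01, freq=20.0):
    # Stand-in for the per-individual high-frequency displacement field:
    # a scalar correction bounded by `amp` (hypothetical analytic form).
    return amp * np.prod(np.sin(freq * x), axis=-1)


def composed_sdf(x, use_displacement=True):
    # Assumed composition: warp the query point, evaluate the template there,
    # then add the high-frequency detail term.
    coarse = template_sdf(x + deformation(x))
    if use_displacement:
        return coarse + displacement(x)
    return coarse


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.uniform(-1.5, 1.5, size=(5, 3))
    print("coarse:", composed_sdf(points, use_displacement=False))
    print("fine:  ", composed_sdf(points, use_displacement=True))
```

The surface is the set of points where `composed_sdf` returns zero. The `use_displacement` flag only mimics evaluating the coarse shape with and without the detail term; how the actual training stages are scheduled is not specified beyond the coarse-to-fine description above.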