3D face reconstruction from a single image has attracted growing interest in the computer vision community, largely because of its broad use in applications such as realistic 3D avatar creation, pose-invariant face recognition, and face hallucination. Since the introduction of the 3D Morphable Model in the late 1990s, a large body of research has targeted this task. Nevertheless, despite the increasing level of detail in single-image 3D face reconstructions, mainly attributed to advances in deep learning, finer and highly deformable components of the face such as the tongue are still absent from all 3D face models in the literature, even though they are very important for the realism of 3D avatar representations. In this work we present what is, to the best of our knowledge, the first end-to-end trainable pipeline that accurately reconstructs the 3D face together with the tongue. Moreover, we make this pipeline robust to "in-the-wild" images by introducing a novel GAN method tailored for 3D tongue surface generation. Finally, we make publicly available the first diverse tongue dataset, consisting of 1,800 raw scans of 700 individuals varying in gender, age, and ethnic background. As we demonstrate in an extensive series of quantitative and qualitative experiments, our model is robust and realistically captures the 3D tongue structure, even in adverse "in-the-wild" conditions.