We present Instant Volumetric Head Avatars (INSTA), a novel approach for reconstructing photo-realistic digital avatars instantaneously. INSTA models a dynamic neural radiance field based on neural graphics primitives embedded around a parametric face model. Our pipeline is trained on a single monocular RGB portrait video that observes the subject under different expressions and views. While state-of-the-art methods take up to several days to train an avatar, our method can reconstruct a digital avatar in less than 10 minutes on modern GPU hardware, which is orders of magnitude faster than previous solutions. In addition, it allows for the interactive rendering of novel poses and expressions. By leveraging the geometry prior of the underlying parametric face model, we demonstrate that INSTA extrapolates to unseen poses. In quantitative and qualitative studies on various subjects, INSTA outperforms state-of-the-art methods regarding rendering quality and training time.