Neural Radiance Fields (NeRF) are compelling techniques for modeling dynamic 3D scenes from 2D image collections. These volumetric representations would be well suited for synthesizing novel facial expressions but for two problems. First, deformable NeRFs are object agnostic and model holistic movement of the scene: they can replay how the motion changes over time, but they cannot alter it in an interpretable way. Second, controllable volumetric representations typically require either time-consuming manual annotations or 3D supervision to provide semantic meaning to the scene. We propose a controllable neural representation for face self-portraits (CoNFies) that solves both of these problems within a common framework and can rely entirely on automated processing. We use automated facial action recognition (AFAR) to characterize facial expressions as a combination of action units (AU) and their intensities. AUs provide both the semantic locations and control labels for the system. CoNFies outperformed competing methods for novel view and expression synthesis in terms of visual and anatomic fidelity of expressions.
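To make the control pathway concrete, below is a minimal sketch of how a vector of AU intensities could condition a NeRF-style deformation field. This is an illustrative assumption, not the authors' implementation: the module name, layer sizes, AU count, and the choice of PyTorch are all hypothetical.

```python
# Illustrative sketch (hypothetical; not the CoNFies code): an AU-conditioned
# deformation field that warps sample points into a canonical face space.
import torch
import torch.nn as nn

class AUDeformationField(nn.Module):
    """Maps a 3D point plus a vector of AU intensities to a 3D displacement.

    The AU intensity vector (e.g. predicted by an off-the-shelf AFAR model)
    serves as the interpretable control signal: changing one entry should
    deform the canonical face geometry in a localized, semantic way.
    """
    def __init__(self, num_aus: int = 12, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + num_aus, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # per-point offset into canonical space
        )

    def forward(self, xyz: torch.Tensor, au_intensities: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) sampled points; au_intensities: (N, num_aus) control code
        return self.mlp(torch.cat([xyz, au_intensities], dim=-1))

# Usage: warp points before querying a canonical radiance field.
points = torch.rand(1024, 3)
aus = torch.zeros(1024, 12)
aus[:, 4] = 0.8  # drive a single action unit to high intensity
canonical_points = points + AUDeformationField()(points, aus)
```

In this sketch, editing a single AU intensity is what makes the motion controllable rather than merely replayable; the actual system would additionally use the AU locations as semantic anchors, as described in the abstract.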