This paper presents the first significant work on directly predicting 3D face landmarks on neural radiance fields (NeRFs), without using intermediate representations such as 2D images, depth maps, or point clouds. Our 3D coarse-to-fine Face Landmarks NeRF (FLNeRF) model efficiently samples from the NeRF on the whole face and on individual facial features to predict accurate landmarks. To mitigate the limited number of facial expressions in the available data, we apply local, non-linear NeRF warps to facial features at fine scale to simulate a wide range of emotions, including exaggerated facial expressions (e.g., cheek blowing, wide mouth opening, eye blinking), for training FLNeRF. With this expression augmentation, our model can predict 3D landmarks for expressions beyond the 20 discrete expressions given in the data. Robust 3D NeRF facial landmarks contribute to many downstream tasks. As an example, we modify MoFaNeRF to enable high-quality face editing and swapping using face landmarks on NeRF, allowing more direct control and a wider range of complex expressions. Experiments show that the improved model using landmarks achieves comparable or better results.
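As a rough illustration of what a local, non-linear warp of sample coordinates can look like, the sketch below displaces 3D points near a chosen center with a Gaussian falloff, so that points far from the facial feature are left almost untouched. This is not the warp used in FLNeRF: the Gaussian weighting, the function name `local_warp`, and all parameters are assumptions made for illustration only.

```python
import math

def local_warp(points, center, displacement, radius):
    """Hypothetical local warp: shift points near `center` by `displacement`.

    The shift is scaled by a Gaussian weight of the distance to `center`,
    so the deformation is non-linear in space and fades out beyond `radius`.
    """
    warped = []
    for p in points:
        # Squared Euclidean distance from the point to the warp center.
        sq_dist = sum((pi - ci) ** 2 for pi, ci in zip(p, center))
        # Weight is 1 at the center and decays smoothly with distance.
        weight = math.exp(-sq_dist / (2.0 * radius ** 2))
        warped.append(tuple(pi + weight * di for pi, di in zip(p, displacement)))
    return warped

# Example: a point at the warp center moves fully, a distant point barely moves.
samples = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
moved = local_warp(samples, center=(0.0, 0.0, 0.0),
                   displacement=(0.0, 0.5, 0.0), radius=1.0)
```

In a NeRF setting, such a function would typically be applied to the query coordinates before evaluating the field, and the same warp would be applied to the ground-truth landmarks so that augmented inputs and labels stay consistent; both of these steps are assumptions about how the idea could be wired up, not a description of the paper's pipeline.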