This paper presents the first significant work on directly predicting 3D face landmarks on neural radiance fields (NeRFs), without using intermediate representations such as 2D images, depth maps, or point clouds. Our 3D coarse-to-fine Face Landmarks NeRF (FLNeRF) model efficiently samples from the NeRF over the whole face and at individual facial features to predict accurate landmarks. To mitigate the limited number of facial expressions in the available data, a local, non-linear NeRF warp is applied to facial features at the fine scale to simulate a wide range of emotions, including exaggerated facial expressions (e.g., cheek blowing, wide mouth opening, eye blinking), for training FLNeRF. With this expression augmentation, our model can predict 3D landmarks for expressions beyond the 20 discrete expressions given in the data. Robust 3D NeRF facial landmarks contribute to many downstream tasks. As an example, we modify MoFaNeRF to enable high-quality face editing and swapping using face landmarks on NeRF, allowing more direct control and a wider range of complex expressions. Experiments show that the improved model using landmarks achieves comparable or better results. GitHub link: https://github.com/ZHANG1023/FLNeRF.
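To make the idea of a local, non-linear warp concrete, the following is a minimal sketch and not the FLNeRF implementation: it displaces 3D sample coordinates near a facial feature using a Gaussian falloff, so that points far from the feature are left essentially unchanged. The function name `local_warp` and the parameters `center`, `displacement`, and `radius` are hypothetical, and the Gaussian weighting is an assumption chosen for illustration. How the warped coordinates are fed to the radiance field (for instance, whether the inverse mapping is used at query time) is left open here.

```python
import numpy as np


def local_warp(points, center, displacement, radius):
    """Displace 3D points near `center`, with Gaussian falloff (illustrative only).

    points:       (N, 3) array of sample coordinates.
    center:       (3,) centre of the facial feature (hypothetical input).
    displacement: (3,) offset applied at full strength exactly at the centre.
    radius:       falloff scale; points much farther away barely move.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    displacement = np.asarray(displacement, dtype=float)

    # Squared distance of every point to the feature centre.
    sq_dist = np.sum((points - center) ** 2, axis=1)

    # Weight is 1 at the centre and decays smoothly to 0, which makes the
    # warp both local (limited support in practice) and non-linear.
    weight = np.exp(-sq_dist / (2.0 * radius ** 2))

    return points + weight[:, None] * displacement
```

For example, calling `local_warp(samples, mouth_center, [0.0, -0.1, 0.0], 0.5)` with hypothetical `samples` and `mouth_center` arrays would pull points around the mouth downward while leaving the rest of the face nearly fixed.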