3D-aware GANs offer new capabilities for creative content editing, such as view synthesis, while preserving the editing capability of their 2D counterparts. Using GAN inversion, these methods can reconstruct an image or a video by optimizing/predicting a latent code and achieve semantic editing by manipulating the latent code. However, a model pre-trained on a face dataset (e.g., FFHQ) often has difficulty handling faces with out-of-distribution (OOD) objects (e.g., heavy make-up or occlusions). We address this issue by explicitly modeling OOD objects in face videos. Our core idea is to represent the face in a video using two neural radiance fields, one for in-distribution and the other for out-of-distribution data, and compose them together for reconstruction. Such explicit decomposition alleviates the inherent trade-off between reconstruction fidelity and editability. We evaluate our method's reconstruction accuracy and editability on challenging real videos and showcase favorable results against other baselines.
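The abstract does not specify how the two radiance fields are composed. A minimal sketch of one common choice in compositional NeRF work is density-weighted blending followed by standard volume rendering: the two fields' densities add, and each sample's color is a density-weighted mix of the two fields' colors. All names below (`sigma_id`, `rgb_id`, `sigma_ood`, `rgb_ood`, `deltas`) are hypothetical; this is an illustrative assumption, not the paper's actual operator.

```python
import torch

def compose_radiance_fields(sigma_id, rgb_id, sigma_ood, rgb_ood, deltas):
    """Hypothetical sketch: compose an in-distribution and an OOD radiance
    field along one ray via density-weighted blending, then volume-render.

    sigma_id, sigma_ood: (N,) per-sample densities from each field
    rgb_id, rgb_ood:     (N, 3) per-sample colors from each field
    deltas:              (N,) distances between adjacent ray samples
    Returns the rendered (3,) RGB color for the ray.
    """
    eps = 1e-8
    # Densities of the two fields add, as in compositional NeRF variants.
    sigma = sigma_id + sigma_ood
    # Blend colors in proportion to each field's density at the sample.
    rgb = (sigma_id[..., None] * rgb_id
           + sigma_ood[..., None] * rgb_ood) / (sigma[..., None] + eps)
    # Standard volume-rendering quadrature over the composed field.
    alpha = 1.0 - torch.exp(-sigma * deltas)                     # (N,)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:1]), 1.0 - alpha + eps]), dim=0
    )[:-1]                                                       # (N,)
    weights = alpha * trans                                      # (N,)
    return (weights[..., None] * rgb).sum(dim=0)                 # (3,)
```

Under this composition, editing only the in-distribution field leaves the OOD field (and hence the occluder) untouched, which is one way such a decomposition could decouple fidelity from editability.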