We present a novel approach to single-view face relighting in the wild. Handling non-diffuse effects, such as global illumination or cast shadows, has long been a challenge in face relighting. Prior work often assumes Lambertian surfaces or simplified lighting models, or requires estimating 3D shape, albedo, or a shadow map. This estimation, however, is error-prone and requires many training examples with lighting ground truth to generalize well. Our work bypasses the need for accurate estimation of intrinsic components and can be trained solely on 2D images, without any light stage data, multi-view images, or lighting ground truth. Our key idea is to leverage a conditional denoising diffusion implicit model (DDIM) to decode a disentangled light encoding together with other encodings related to 3D shape and facial identity inferred from off-the-shelf estimators. We also propose a novel conditioning technique that eases the modeling of the complex interaction between light and geometry by using a rendered shading reference to spatially modulate the DDIM. We achieve state-of-the-art performance on the standard Multi-PIE benchmark and can photorealistically relight in-the-wild images. Please visit our page: https://diffusion-face-relighting.github.io
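To make the DDIM decoding mentioned above more concrete, the following is a minimal sketch of the generic deterministic DDIM update (the eta = 0 case from the DDIM literature). It is not the authors' network or their conditioning scheme. The function name `ddim_step`, the flat-list image representation, and the noise levels are illustrative assumptions, and the learned noise predictor (which in this setting would be conditioned on the light, shape, and identity encodings) is replaced here by the true noise so the example stays self-contained.

```python
import math
import random


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) toward a lower noise level.

    x_t and eps_pred are flat lists of pixel values (an illustrative
    simplification); alpha_bar_* are cumulative signal levels in (0, 1].
    """
    sqrt_a_t = math.sqrt(alpha_bar_t)
    sqrt_one_minus_a_t = math.sqrt(1.0 - alpha_bar_t)
    sqrt_a_prev = math.sqrt(alpha_bar_prev)
    sqrt_one_minus_a_prev = math.sqrt(1.0 - alpha_bar_prev)

    x_prev = []
    for x, e in zip(x_t, eps_pred):
        # Clean-image estimate implied by the current noise prediction.
        x0_pred = (x - sqrt_one_minus_a_t * e) / sqrt_a_t
        # Move to the lower noise level, reusing the same predicted noise.
        x_prev.append(sqrt_a_prev * x0_pred + sqrt_one_minus_a_prev * e)
    return x_prev


# Toy check with a stand-in for the network: use the true noise as the prediction.
random.seed(0)
x0 = [random.gauss(0.0, 1.0) for _ in range(16)]   # hypothetical clean "image"
eps = [random.gauss(0.0, 1.0) for _ in range(16)]  # noise that was added
a_t, a_prev = 0.5, 0.8                             # assumed noise levels

x_t = [math.sqrt(a_t) * a + math.sqrt(1.0 - a_t) * b for a, b in zip(x0, eps)]
x_prev = ddim_step(x_t, eps, a_t, a_prev)
```

Because the update is deterministic, a perfect noise prediction moves `x_t` exactly onto the same trajectory at the lower noise level. This determinism is what allows an image to be encoded to noise and decoded again, which is the property a relighting pipeline of this kind would rely on when it changes only the light encoding.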
Title: DiFaReli: Diffusion Face Relighting