Face recognition models embed a face image into a low-dimensional identity vector containing abstract encodings of identity-specific facial features that allow individuals to be distinguished from one another. We tackle the challenging task of inverting the latent space of pre-trained face recognition models without full model access (i.e., the black-box setting). A variety of methods have been proposed in the literature for this task, but they have serious shortcomings such as a lack of realistic outputs, long inference times, and strict requirements on the training data set and on access to the face recognition model. Through an analysis of the black-box inversion problem, we show that the conditional diffusion model loss naturally emerges and that we can effectively sample from the inverse distribution even without an identity-specific loss. Our method, named identity denoising diffusion probabilistic model (ID3PM), leverages the stochastic nature of the denoising diffusion process to produce high-quality, identity-preserving face images with various backgrounds, lighting, poses, and expressions. We demonstrate state-of-the-art performance in terms of identity preservation and diversity both qualitatively and quantitatively. Our method is the first black-box face recognition model inversion method that offers intuitive control over the generation process and does not suffer from any of the common shortcomings of competing methods.
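To make the central claim concrete, the sketch below illustrates what an identity-conditioned denoising diffusion loss of the kind described above could look like: a standard DDPM noise-prediction objective whose denoiser is simply conditioned on a fixed identity embedding produced by a black-box face recognition model, with no identity-specific (e.g., embedding-distance) loss term. The placeholder `Denoiser` network, the embedding dimension, the image resolution, and the linear noise schedule are illustrative assumptions, not the paper's actual architecture or hyperparameters.

```python
# Minimal sketch (PyTorch) of an identity-conditioned DDPM training loss.
# All architecture and schedule choices here are assumptions for illustration.
import torch
import torch.nn as nn

T = 1000                               # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)  # linear noise schedule (assumed)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)


class Denoiser(nn.Module):
    """Placeholder epsilon-predictor conditioned on an identity vector."""

    def __init__(self, img_dim=3 * 64 * 64, id_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim + id_dim + 1, 1024), nn.SiLU(),
            nn.Linear(1024, img_dim),
        )

    def forward(self, x_t, t, id_emb):
        # Flatten the noisy image, concatenate the identity embedding
        # and the normalized timestep, and predict the added noise.
        b = x_t.shape[0]
        h = torch.cat(
            [x_t.reshape(b, -1), id_emb, t.float().view(b, 1) / T], dim=1
        )
        return self.net(h).reshape_as(x_t)


def diffusion_loss(denoiser, x0, id_emb):
    """Simple DDPM objective: predict the noise given x_t, t, and the identity
    condition. No identity-specific loss term is used."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,))
    eps = torch.randn_like(x0)
    a_bar = alphas_bar[t].view(b, 1, 1, 1)
    # Forward diffusion q(x_t | x_0): mix the clean image with Gaussian noise.
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps
    return ((denoiser(x_t, t, id_emb) - eps) ** 2).mean()


if __name__ == "__main__":
    model = Denoiser()
    x0 = torch.randn(4, 3, 64, 64)   # stand-in for training face images
    id_emb = torch.randn(4, 512)     # stand-in for black-box face embeddings
    print(diffusion_loss(model, x0, id_emb).item())
```

At sampling time, the same identity embedding would condition every reverse diffusion step, so the stochasticity of the sampler yields diverse faces (background, lighting, pose, expression) that all map back to the target identity vector.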