Diffusion models achieve outstanding generative performance in various domains. Despite their great success, they lack a semantic latent space, which is essential for controlling the generative process. To address this problem, we propose an asymmetric reverse process (Asyrp), which discovers the semantic latent space in frozen pretrained diffusion models. Our semantic latent space, named h-space, has nice properties for accommodating semantic image manipulation: homogeneity, linearity, robustness, and consistency across timesteps. In addition, we introduce a principled design of the generative process for versatile editing and quality boosting, based on two quantifiable measures: the editing strength of an interval and the quality deficiency at a timestep. Our method is applicable to various architectures (DDPM++, iDDPM, and ADM) and datasets (CelebA-HQ, AFHQ-dog, LSUN-church, LSUN-bedroom, and METFACES). Project page: https://kwonminki.github.io/Asyrp/
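Translation: Diffusion models have achieved excellent generative performance across many domains. Despite this success, they lack the semantic latent space needed to control the generative process. To solve this problem, we propose the asymmetric reverse process (Asyrp), which discovers a semantic latent space in frozen pretrained diffusion models. Our semantic latent space, called h-space, has properties well suited to semantic image manipulation: homogeneity, linearity, robustness, and consistency across timesteps. In addition, we introduce a principle for designing the generative process based on quantifiable measures, enabling versatile editing and quality improvement: the editing strength of a time interval and the quality deficiency at a timestep. Our method applies to a variety of architectures (DDPM++, iDDPM, and ADM) and datasets (CelebA-HQ, AFHQ-dog, LSUN-church, LSUN-bedroom, and METFACES). Project page: https://kwonminki.github.io/Asyrp/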
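The abstract does not spell out the update rule, so the following is only a minimal sketch of what an asymmetric reverse step could look like. It assumes the deterministic DDIM update (no added noise) and assumes that "asymmetric" means the shift in h-space is applied only to the noise estimate used for the predicted clean image, while the direction term keeps the original estimate. The names `toy_eps`, `ddim_step`, and `asyrp_step` are hypothetical, and `toy_eps` is a stand-in for a frozen pretrained noise predictor, not a real network.

```python
import numpy as np


def ddim_step(x_t, eps_p, eps_d, alpha_t, alpha_prev):
    """One deterministic DDIM-style step with two separate noise estimates.

    eps_p is used for the predicted clean image, eps_d for the direction
    back toward x_t. alpha_t and alpha_prev are cumulative noise-schedule
    products in (0, 1]. Passing the same estimate twice gives plain DDIM.
    """
    predicted_x0 = (x_t - np.sqrt(1.0 - alpha_t) * eps_p) / np.sqrt(alpha_t)
    direction = np.sqrt(1.0 - alpha_prev) * eps_d
    return np.sqrt(alpha_prev) * predicted_x0 + direction


def toy_eps(x_t, delta_h=0.0):
    """Hypothetical frozen noise predictor with an editable bottleneck.

    np.tanh(x_t) plays the role of the bottleneck feature h, and delta_h
    is the shift applied in that feature space (assumed, for illustration).
    """
    h = np.tanh(x_t)
    return 0.5 * (h + delta_h)


def asyrp_step(x_t, delta_h, alpha_t, alpha_prev):
    """Asymmetric step: shifted estimate for x0, original for direction."""
    eps_shifted = toy_eps(x_t, delta_h)  # edited bottleneck
    eps_original = toy_eps(x_t)          # untouched bottleneck
    return ddim_step(x_t, eps_shifted, eps_original, alpha_t, alpha_prev)


x = np.array([0.3, -1.2, 0.7])
plain = ddim_step(x, toy_eps(x), toy_eps(x), alpha_t=0.5, alpha_prev=0.7)
edited = asyrp_step(x, delta_h=0.4, alpha_t=0.5, alpha_prev=0.7)
print(plain)
print(edited)
```

With `delta_h=0.0` the asymmetric step reduces to the plain DDIM step, so the frozen model's behaviour is unchanged until a shift is supplied. In this sketch the edit only moves the predicted clean image, which is the asymmetry the method's name suggests.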