Applying powerful generative denoising diffusion models (DDMs) to downstream tasks such as image semantic editing usually requires either fine-tuning pre-trained DDMs or learning auxiliary editing networks. In this work, we achieve SOTA semantic control performance across various application settings by optimizing the denoising trajectory solely with frozen DDMs. As one of the first optimization-based diffusion editing works, we begin by seeking a more comprehensive understanding of the intermediate high-dimensional latent spaces, theoretically and empirically analyzing their probabilistic and geometric behaviors along the Markov chain. We then propose to identify the critical step in the denoising trajectory that characterizes the convergence of a pre-trained DDM. Finally, we present our method for locating semantic subspace boundaries for controllable manipulation, guiding the denoising trajectory towards the targeted boundary at the critical convergent step. We conduct extensive experiments on various DPM architectures (DDPM, iDDPM) and datasets (CelebA, CelebA-HQ, LSUN-church, LSUN-bedroom, AFHQ-dog) at different resolutions (64, 256) as empirical demonstrations.
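The core idea of guiding a frozen model's denoising trajectory toward a semantic boundary at the critical convergent step can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: `ddm.denoise_step` is a hypothetical one-step reverse-diffusion API, and `boundary_normal` stands in for the normal of a semantic subspace boundary (e.g., fit by a linear classifier on intermediate latents).

```python
import numpy as np

def edit_denoising_trajectory(ddm, x_T, boundary_normal, t_critical,
                              alpha=0.5, num_steps=1000):
    """Hypothetical sketch: steer a frozen DDM's denoising trajectory.

    The model's parameters are never updated; editing happens purely by
    perturbing the intermediate latent at the critical convergent step.
    """
    x = x_T
    for t in reversed(range(num_steps)):
        x = ddm.denoise_step(x, t)  # frozen model: one reverse-diffusion step
        if t == t_critical:
            # Nudge the latent toward the targeted semantic boundary once,
            # at the step where the trajectory has effectively converged.
            x = x + alpha * boundary_normal
    return x
```

Because the guidance is applied to the latent rather than to model weights, the same frozen DDM can serve many editing directions by swapping in different boundary normals.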