End-to-End Diffusion Latent Optimization Improves Classifier Guidance

Classifier guidance -- using the gradients of an image classifier to steer the generations of a diffusion model -- has the potential to dramatically expand the creative control over image generation and editing. However, currently classifier guidance requires either training new noise-aware models to obtain accurate gradients or using a one-step denoising approximation of the final generation, which leads to misaligned gradients and sub-optimal control. We highlight this approximation's shortcomings and propose a novel guidance method: Direct Optimization of Diffusion Latents (DOODL), which enables plug-and-play guidance by optimizing diffusion latents w.r.t. the gradients of a pre-trained classifier on the true generated pixels, using an invertible diffusion process to achieve memory-efficient backpropagation. Showcasing the potential of more precise guidance, DOODL outperforms one-step classifier guidance on computational and human evaluation metrics across different forms of guidance: using CLIP guidance to improve generations of complex prompts from DrawBench, using fine-grained visual classifiers to expand the vocabulary of Stable Diffusion, enabling image-conditioned generation with a CLIP visual encoder, and improving image aesthetics using an aesthetic scoring network.
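The core idea in the abstract, optimizing the initial latent against a loss evaluated on the fully generated output rather than on a one-step denoised estimate, can be illustrated with a toy sketch. This is not the authors' implementation: the scalar "sampler" (`step`), the squared-error `guidance_loss` standing in for a classifier loss, and the constants `NUM_STEPS` and `STEP_SIZE` are all hypothetical. The sketch also stores every intermediate state for backpropagation, whereas DOODL is described as using an invertible diffusion process to avoid that memory cost.

```python
import math

NUM_STEPS = 10    # hypothetical number of sampling steps
STEP_SIZE = 0.1   # hypothetical per-step update scale


def step(x):
    """Toy deterministic update standing in for one denoising step."""
    return x - STEP_SIZE * math.tanh(x)


def step_grad(x):
    """Derivative of step(x) with respect to x."""
    return 1.0 - STEP_SIZE * (1.0 - math.tanh(x) ** 2)


def generate(latent):
    """Run the full chain and return every state, from latent to output."""
    states = [latent]
    for _ in range(NUM_STEPS):
        states.append(step(states[-1]))
    return states


def guidance_loss(output, target):
    """Toy stand-in for a classifier loss on the final generated output."""
    return (output - target) ** 2


def latent_gradient(latent, target):
    """Backpropagate the loss through all steps to the initial latent."""
    states = generate(latent)
    grad = 2.0 * (states[-1] - target)   # dLoss / dOutput
    for x in reversed(states[:-1]):      # chain rule, last step first
        grad *= step_grad(x)
    return grad


def optimize_latent(latent, target, lr=0.5, iters=200):
    """Plain gradient descent on the latent itself."""
    for _ in range(iters):
        latent -= lr * latent_gradient(latent, target)
    return latent


if __name__ == "__main__":
    start = 2.0
    tuned = optimize_latent(start, target=0.5)
    print("output before:", generate(start)[-1])
    print("output after: ", generate(tuned)[-1])
```

The gradient here is taken through the real sampling chain, so it reflects what the final output actually does when the latent changes. A one-step approximation would instead differentiate a single-step estimate of the output, which is the source of the misaligned gradients the abstract describes.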
