FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

Conditional diffusion models have recently gained popularity in numerous applications because of their exceptional generation ability. However, many existing methods require training: they need a time-dependent classifier or a condition-dependent score estimator, which raises the cost of building a conditional diffusion model and makes it hard to transfer across conditions. Some recent works address this limitation with training-free solutions, but most apply only to a specific category of tasks rather than to general conditions. In this work, we propose a training-Free conditional Diffusion Model (FreeDoM) that can be used for various conditions. Specifically, we leverage off-the-shelf pre-trained networks, such as a face detection model, to construct time-independent energy functions that guide the generation process without any additional training. Because the energy function can be constructed flexibly and adapted to many conditions, FreeDoM has a broader range of applications than existing training-free methods. FreeDoM is simple, effective, and low-cost. Experiments demonstrate that it is effective for various conditions and suitable for diffusion models in diverse data domains, including the image and latent code domains.
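To make the idea of guiding a sampler with a time-independent energy function more concrete, here is a minimal toy sketch in Python with NumPy. It is not the authors' implementation. Everything in it is an assumption made for illustration: the data distribution is a standard Gaussian (so the noise predictor has a closed form and no network is needed), the energy is a plain squared distance to a hypothetical `target` vector standing in for an off-the-shelf network, and the energy is evaluated on the clean-sample estimate computed from the current noisy sample, which is one common way to use an energy defined only on clean data. The names `make_schedule`, `toy_eps_model`, `energy`, `energy_grad_wrt_xt`, `sample`, and `guidance_scale` are all hypothetical.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.2):
    """Linear DDPM noise schedule (values chosen only for this toy)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_eps_model(x_t, alpha_bar):
    """Exact noise predictor when clean data is N(0, I) (toy assumption).

    In that case x_t is also N(0, I), so E[eps | x_t] = sqrt(1 - alpha_bar) * x_t.
    A real system would call a pre-trained diffusion network here.
    """
    return np.sqrt(1.0 - alpha_bar) * x_t


def energy(x0, target):
    """Time-independent energy on clean samples: squared distance to target."""
    return np.sum((x0 - target) ** 2, axis=-1)


def energy_grad_wrt_xt(x_t, alpha_bar, target):
    """Gradient of energy(x0_hat(x_t), target) with respect to x_t."""
    eps = toy_eps_model(x_t, alpha_bar)
    # Clean-sample estimate from the noisy sample and the predicted noise.
    x0_hat = (x_t - np.sqrt(1.0 - alpha_bar) * eps) / np.sqrt(alpha_bar)
    # For this toy model x0_hat = sqrt(alpha_bar) * x_t, so the chain rule
    # contributes a factor sqrt(alpha_bar).
    return 2.0 * (x0_hat - target) * np.sqrt(alpha_bar)


def sample(target, guidance_scale, num_samples=2000, dim=2, seed=0):
    """DDPM ancestral sampling with an extra energy-gradient step."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule()
    x = rng.standard_normal((num_samples, dim))
    for t in range(len(betas) - 1, -1, -1):
        eps = toy_eps_model(x, alpha_bars[t])
        # Standard unconditional DDPM posterior mean.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        # Guidance: move against the energy gradient; no training involved.
        grad = energy_grad_wrt_xt(x, alpha_bars[t], target)
        x = mean + np.sqrt(betas[t]) * noise - guidance_scale * grad
    return x
```

Calling `sample(np.array([2.0, -1.0]), guidance_scale=0.0)` draws from the unconditional toy model, while a positive `guidance_scale` such as `0.1` pulls the samples toward the target. Swapping `energy` for a distance computed by a pre-trained network would require automatic differentiation in place of the hand-written gradient.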

