Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models - Zhuanzhi Papers

March 20, 2023

Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models

René Haas, Inbar Huberman-Spiegelglas, Rotem Mulayoff, Tomer Michaeli

Denoising Diffusion Models (DDMs) have emerged as a strong competitor to Generative Adversarial Networks (GANs). However, despite their widespread use in image synthesis and editing applications, their latent space is still not as well understood. Recently, a semantic latent space for DDMs, coined "$h$-space", was shown to facilitate semantic image editing in a way reminiscent of GANs. The $h$-space comprises the bottleneck activations in the DDM's denoiser across all timesteps of the diffusion process. In this paper, we explore the properties of $h$-space and propose several novel methods for finding meaningful semantic directions within it. We start by studying unsupervised methods for revealing interpretable semantic directions in pretrained DDMs. Specifically, we show that global latent directions emerge as the principal components in the latent space. Additionally, we provide a novel method for discovering image-specific semantic directions by spectral analysis of the Jacobian of the denoiser with respect to the latent code. Next, we extend the analysis by finding directions in a supervised fashion in unconditional DDMs. We demonstrate how such directions can be found by relying either on a labeled data set of real images or on generated samples annotated with a domain-specific attribute classifier. We further show how to semantically disentangle the found directions by simple linear projection. Our approaches are applicable without requiring any architectural modifications, text-based guidance, CLIP-based optimization, or model fine-tuning.
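
As a rough illustration of three operations named in the abstract (principal components as global directions, a supervised direction from labeled samples, and disentanglement by linear projection), here is a minimal sketch on toy data. Everything in it is an assumption made for illustration: the 4-dimensional random vectors stand in for flattened $h$-space activations, the function names are hypothetical, and the difference-of-class-means estimator is a common choice that may differ from what the paper actually uses. No diffusion model is involved.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def top_principal_direction(rows, iters=200):
    """Unsupervised: first principal component via power iteration."""
    mu = mean(rows)
    centered = [[a - m for a, m in zip(r, mu)] for r in rows]
    v = normalize([1.0] * len(mu))
    for _ in range(iters):
        # Apply X^T X to v without forming the covariance matrix.
        scores = [dot(r, v) for r in centered]
        v = normalize([dot(scores, col) for col in zip(*centered)])
    return v


def supervised_direction(pos_rows, neg_rows):
    """Supervised (assumed estimator): difference of class means."""
    return normalize([p - n for p, n in zip(mean(pos_rows), mean(neg_rows))])


def project_out(direction, unwanted):
    """Disentangle: remove the component of `direction` along `unwanted`."""
    u = normalize(unwanted)
    c = dot(direction, u)
    return normalize([d - c * a for d, a in zip(direction, u)])


# Toy stand-ins for activations: axis 0 carries most of the variance.
rng = random.Random(0)
rows = [[rng.gauss(0.0, s) for s in (3.0, 1.0, 0.5, 0.5)] for _ in range(200)]

pc = top_principal_direction(rows)

# Hypothetical labeled sets: "positive" samples are shifted along axes 0 and 1,
# so the attribute is entangled with whatever axis 1 encodes.
shift = [2.0, 1.0, 0.0, 0.0]
neg = rows
pos = [[a + s for a, s in zip(r, shift)] for r in rows]
attr = supervised_direction(pos, neg)

unwanted = [0.0, 1.0, 0.0, 0.0]  # hypothetical second attribute direction
clean = project_out(attr, unwanted)

# Editing then amounts to moving an activation along the chosen direction.
alpha = 1.5  # edit strength, arbitrary
edited = [h + alpha * d for h, d in zip(rows[0], clean)]
```

In this toy setup the projected direction no longer has any component along `unwanted`, which is the sense in which a linear projection can separate two attributes that were entangled in the raw estimate.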
