迈向实用插插插和插插插插插插插出模型 (Towards Practical Plug-and-Play Diffusion Models) - 专知论文

会员服务 ·

0

Guidance · MoDELS · Performer · 标注 · Processing（编程语言） ·

2022 年 12 月 12 日

Towards Practical Plug-and-Play Diffusion Models

翻译：迈向实用插插插和插插插插插插插出模型

Hyojun Go,Yunsung Lee,Jin-Young Kim,Seunghyun Lee,Myeongho Jeong,Hyun Seung Lee,Seungtaek Choi

Diffusion-based generative models have achieved remarkable success in image generation. Their guidance formulation allows an external model to plug-and-play control the generation process for various tasks without fine-tuning the diffusion model. However, the direct use of publicly available off-the-shelf models for guidance fails due to their poor performance on noisy inputs. For that, the existing practice is to fine-tune the guidance models with labeled data corrupted with noises. In this paper, we argue that this practice has limitations in two aspects: (1) performing on inputs with extremely various noises is too hard for a single model; (2) collecting labeled datasets hinders scaling up for various tasks. To tackle the limitations, we propose a novel strategy that leverages multiple experts where each expert is specialized in a particular noise range and guides the reverse process at its corresponding timesteps. However, as it is infeasible to manage multiple networks and utilize labeled data, we present a practical guidance framework termed Practical Plug-And-Play (PPAP), which leverages parameter-efficient fine-tuning and data-free knowledge transfer. We exhaustively conduct ImageNet class conditional generation experiments to show that our method can successfully guide diffusion with small trainable parameters and no labeled data. Finally, we show that image classifiers, depth estimators, and semantic segmentation models can guide publicly available GLIDE through our framework in a plug-and-play manner.

翻译：在图像生成方面,基于扩散的基因模型取得了显著的成功。它们的指导性设计使外部模型能够在不微调扩散模型的情况下控制各种任务的生成过程。然而,直接使用公开可得的现成指导模型,因其在噪音输入方面的表现不佳而未能成功。为此,现行做法是用标签数据对指导模型进行微调,而标签数据因噪音而腐蚀。在本文件中,我们争辩说,这种做法在两个方面有局限性:(1) 以极其多种噪音进行投入的操作对于单一模型来说过于困难;(2) 收集标签数据集阻碍扩大各种任务的规模。为克服这些限制,我们提议采用新的战略,利用多位专家,即每位专家在特定噪音范围内专门使用现成的现成指导模式指导反向进程。然而,由于管理多个网络和使用标签数据不可行,我们提出了一个实用的指导框架,即实用的普卢格和普莱(PPAP),它利用参数高效的微调和无数据知识转移。我们详尽的图像网络分类化工具,我们用不易懂的升级的模型指导,我们用不易在公开的图像分类中进行分类和升级的实验室实验,我们用这种方法展示。

0

相关内容

Guidance

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

剪应力响应的二氢杨梅素Pickering乳液用于治疗动脉粥样硬化的研究

国家自然科学基金

0+阅读 · 2015年12月31日

牛蒡子中Arctignan A，Lappaol C及其衍生物的合成和抗白血病活性研究

国家自然科学基金

0+阅读 · 2014年12月31日

靶向微管蛋白秋水仙碱位点的白藜芦醇-Combrestatin A-4类抑制剂的设计、合成及活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

液相还原法制备Heusler合金纳米颗粒及其结构和性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

可压缩Navier-Stokes方程的一些数学问题

国家自然科学基金

0+阅读 · 2012年12月31日

基于RGD配体设计的CdTe量子点靶向探针的直接合成及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

肾上腺源性及原发性高血压线粒体tRNAIle、tRNALeu(UUR)和tRNAlys基因突变的差异对比研究

国家自然科学基金

0+阅读 · 2009年12月31日

约化群酉表示的branching law及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

Asperger综合症情绪认知的神经心理调控机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

Towards Agile Text Classifiers for Everyone

Arxiv

0+阅读 · 2023年2月13日

A Practical Mixed Precision Algorithm for Post-Training Quantization

Arxiv

0+阅读 · 2023年2月10日

Understanding and contextualising diffusion models

Arxiv

0+阅读 · 2023年2月10日

Q-Diffusion: Quantizing Diffusion Models

Arxiv

0+阅读 · 2023年2月8日

A Survey on Generative Diffusion Model

Arxiv

45+阅读 · 2022年9月6日

Deep Learning in Multimodal Remote Sensing Data Fusion: A Comprehensive Review

Arxiv

17+阅读 · 2022年5月3日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection

Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection

Arxiv

15+阅读 · 2019年3月18日

Feature Denoising for Improving Adversarial Robustness

Feature Denoising for Improving Adversarial Robustness

Arxiv

15+阅读 · 2018年12月9日

Towards Understanding and Answering Multi-Sentence Recommendation Questions on Tourism

Arxiv

15+阅读 · 2018年1月5日

VIP会员

文章信息

相关主题

Processing（编程语言）

相关VIP内容

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

数据驱动死亡：以色列AI战争机器如何锁定目标

【普林斯顿博士论文】通过以人为本的评估推动负责任的人工智能

ICML 2025 | BiAssemble: 双臂机器人几何拼合问题的协同可供性学习

ICML 2025杰出论文出炉：8篇获奖，南大研究者榜上有名

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

相关论文

Towards Agile Text Classifiers for Everyone

Arxiv

0+阅读 · 2023年2月13日

A Practical Mixed Precision Algorithm for Post-Training Quantization

Arxiv

0+阅读 · 2023年2月10日

Understanding and contextualising diffusion models

Arxiv

0+阅读 · 2023年2月10日

Q-Diffusion: Quantizing Diffusion Models

Arxiv

0+阅读 · 2023年2月8日

A Survey on Generative Diffusion Model

Arxiv

45+阅读 · 2022年9月6日

Deep Learning in Multimodal Remote Sensing Data Fusion: A Comprehensive Review

Arxiv

17+阅读 · 2022年5月3日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection

Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection

Arxiv

15+阅读 · 2019年3月18日

Feature Denoising for Improving Adversarial Robustness

Feature Denoising for Improving Adversarial Robustness

Arxiv

15+阅读 · 2018年12月9日

Towards Understanding and Answering Multi-Sentence Recommendation Questions on Tourism

Arxiv

15+阅读 · 2018年1月5日

相关基金

剪应力响应的二氢杨梅素Pickering乳液用于治疗动脉粥样硬化的研究

国家自然科学基金

0+阅读 · 2015年12月31日

牛蒡子中Arctignan A，Lappaol C及其衍生物的合成和抗白血病活性研究

国家自然科学基金

0+阅读 · 2014年12月31日

靶向微管蛋白秋水仙碱位点的白藜芦醇-Combrestatin A-4类抑制剂的设计、合成及活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

液相还原法制备Heusler合金纳米颗粒及其结构和性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

可压缩Navier-Stokes方程的一些数学问题

国家自然科学基金

0+阅读 · 2012年12月31日

基于RGD配体设计的CdTe量子点靶向探针的直接合成及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

肾上腺源性及原发性高血压线粒体tRNAIle、tRNALeu(UUR)和tRNAlys基因突变的差异对比研究

国家自然科学基金

0+阅读 · 2009年12月31日

约化群酉表示的branching law及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

Asperger综合症情绪认知的神经心理调控机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员