Improved Vector Quantized Diffusion Models - 专知 Papers

February 8, 2023

Improved Vector Quantized Diffusion Models

Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

Source: arXiv

Vector quantized diffusion (VQ-Diffusion) is a powerful generative model for text-to-image synthesis, but it can still generate low-quality samples or images that are only weakly correlated with the text input. We find these issues are mainly due to a flawed sampling strategy. In this paper, we propose two important techniques to further improve the sample quality of VQ-Diffusion. 1) We explore classifier-free guidance sampling for discrete denoising diffusion models and propose a more general and effective implementation of classifier-free guidance. 2) We present a high-quality inference strategy to alleviate the joint distribution issue in VQ-Diffusion. Finally, we conduct experiments on various datasets to validate their effectiveness and show that the improved VQ-Diffusion surpasses the vanilla version by large margins. We achieve an FID score of 8.44 on MSCOCO, surpassing VQ-Diffusion by 5.42 FID points. When trained on ImageNet, we improve the FID score from 11.89 to 4.83, demonstrating the superiority of our proposed techniques.
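To make the first technique more concrete, here is a minimal sketch of how classifier-free guidance is commonly applied to a categorical (discrete-token) distribution: the conditional log-probabilities are pushed away from the unconditional ones by a guidance scale and then renormalized. This is the generic formulation, not the authors' implementation; the function names, the toy logits, and the `scale` parameter are assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def guided_log_probs(cond_logits, uncond_logits, scale):
    """Generic classifier-free guidance over one token's codebook logits.

    Computes log p(x|y) + scale * (log p(x|y) - log p(x)) per codebook
    entry, then renormalizes so the result is a valid distribution.
    scale = 0 recovers the plain conditional distribution.
    """
    log_p_cond = log_softmax(cond_logits)
    log_p_uncond = log_softmax(uncond_logits)
    guided = [c + scale * (c - u) for c, u in zip(log_p_cond, log_p_uncond)]
    return log_softmax(guided)


# Hypothetical logits for a single image token over a 4-entry codebook.
cond_logits = [2.0, 1.0, 0.5, -1.0]    # network output given the text prompt
uncond_logits = [1.0, 1.0, 1.0, 1.0]   # network output given a "null" prompt

plain = [math.exp(v) for v in guided_log_probs(cond_logits, uncond_logits, 0.0)]
sharp = [math.exp(v) for v in guided_log_probs(cond_logits, uncond_logits, 3.0)]
```

In this toy case the unconditional distribution is uniform, so a larger `scale` simply concentrates probability mass on the codebook entries the text-conditioned prediction already favors. A real sampler would apply this at every token position and every denoising step.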
