带有变换器的可缩放扩散模型 (Scalable Diffusion Models with Transformers) - 专知论文

会员服务 ·

0

变换 · MoDELS · 潜在 · U-Net · state-of-the-art ·

2022 年 12 月 19 日

Scalable Diffusion Models with Transformers

翻译：带有变换器的可缩放扩散模型

William Peebles,Saining Xie

from arxiv, Code, project page and videos available at https://www.wpeebles.com/DiT

We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches. We analyze the scalability of our Diffusion Transformers (DiTs) through the lens of forward pass complexity as measured by Gflops. We find that DiTs with higher Gflops -- through increased transformer depth/width or increased number of input tokens -- consistently have lower FID. In addition to possessing good scalability properties, our largest DiT-XL/2 models outperform all prior diffusion models on the class-conditional ImageNet 512x512 and 256x256 benchmarks, achieving a state-of-the-art FID of 2.27 on the latter.

翻译：我们探索基于变压器结构的新型扩散模型。我们训练了图像的潜在扩散模型, 将常用的U- Net主干网替换成在潜伏补丁上运行的变压器。我们通过Gflops测量的远端传变复杂度透镜分析我们的变射变异器的可缩放性。我们发现, Gflops 高的二二T公司 -- -- 通过提高变压器深度/宽度或增加输入符号的数量 -- -- 始终拥有较低的FID。除了拥有良好的可缩放性外,我们最大的二T- XL/2 模型比级条件图像网 512x512 和 256x256 基准上的所有先前扩散模型都更优异, 在后者上实现了2. 27 的最新FID。

1

相关内容

Transformer如何做扩散模型？伯克利最新《transformer可扩展扩散模型》论文

Transformer如何做扩散模型？伯克利最新《transformer可扩展扩散模型》论文

专知会员服务

88+阅读 · 2022年12月22日

【图机器学习进展与趋势@ICML2022】Graph Machine Learning @ ICML 2022

【图机器学习进展与趋势@ICML2022】Graph Machine Learning @ ICML 2022

专知会员服务

40+阅读 · 2022年7月25日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

325+阅读 · 2020年11月26日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

炸药快速检测用纳米胶体阵列的设计与制备

国家自然科学基金

0+阅读 · 2015年12月31日

阵列摆式波浪发电装置的性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于微结构演化的金属材料塑性本构模型研究

国家自然科学基金

0+阅读 · 2013年12月31日

新型BODIPY-茚酮类近红外荧光染料的合成与光物理性质研究

国家自然科学基金

0+阅读 · 2012年12月31日

超高强钢热成形损伤控制与引导车身抗撞零件性能优化设计研究

国家自然科学基金

1+阅读 · 2012年12月31日

miRNASNPs调控鸡miRNA生物合成及功能的机制

国家自然科学基金

0+阅读 · 2012年12月31日

火灾下相贯焊接钢管节点力学性能和破坏机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

双极性树枝状蓝光PhOLED用Ir（Ⅲ）金属配合物的合成与性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

有机配位键合的新型多元金属硫属化物的合成及其性能研究

国家自然科学基金

0+阅读 · 2011年12月31日

改进Max-SAT算法的关键技术研究

国家自然科学基金

0+阅读 · 2009年12月31日

Like a Good Nearest Neighbor: Practical Content Moderation with Sentence Transformers

Arxiv

0+阅读 · 2023年2月17日

Video Probabilistic Diffusion Models in Projected Latent Space

Arxiv

2+阅读 · 2023年2月15日

Score-based Diffusion Models in Function Space

Arxiv

0+阅读 · 2023年2月14日

Diffusion Models in Vision: A Survey

Arxiv

30+阅读 · 2022年9月10日

Multimodal Learning with Transformers: A Survey

Arxiv

69+阅读 · 2022年6月13日

Conditional Prompt Learning for Vision-Language Models

Conditional Prompt Learning for Vision-Language Models

Arxiv

13+阅读 · 2022年3月10日

Masked Autoencoders Are Scalable Vision Learners

Arxiv

27+阅读 · 2021年11月11日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

TinyBERT: Distilling BERT for Natural Language Understanding

TinyBERT: Distilling BERT for Natural Language Understanding

Arxiv

11+阅读 · 2019年9月23日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

Transformer如何做扩散模型？伯克利最新《transformer可扩展扩散模型》论文

Transformer如何做扩散模型？伯克利最新《transformer可扩展扩散模型》论文

专知会员服务

88+阅读 · 2022年12月22日

【图机器学习进展与趋势@ICML2022】Graph Machine Learning @ ICML 2022

【图机器学习进展与趋势@ICML2022】Graph Machine Learning @ ICML 2022

专知会员服务

40+阅读 · 2022年7月25日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

325+阅读 · 2020年11月26日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

隐身自主无人水下航行器技术如何变革水下作战并重塑海军竞争

《俄乌战争中的无人系统：新的战争方式与新兴趋势——来自前线的印象》报告

《海上自主水面船舶远程操作中心：安全可持续运行的多维度分析》

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Like a Good Nearest Neighbor: Practical Content Moderation with Sentence Transformers

Arxiv

0+阅读 · 2023年2月17日

Video Probabilistic Diffusion Models in Projected Latent Space

Arxiv

2+阅读 · 2023年2月15日

Score-based Diffusion Models in Function Space

Arxiv

0+阅读 · 2023年2月14日

Diffusion Models in Vision: A Survey

Arxiv

30+阅读 · 2022年9月10日

Multimodal Learning with Transformers: A Survey

Arxiv

69+阅读 · 2022年6月13日

Conditional Prompt Learning for Vision-Language Models

Conditional Prompt Learning for Vision-Language Models

Arxiv

13+阅读 · 2022年3月10日

Masked Autoencoders Are Scalable Vision Learners

Arxiv

27+阅读 · 2021年11月11日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

TinyBERT: Distilling BERT for Natural Language Understanding

TinyBERT: Distilling BERT for Natural Language Understanding

Arxiv

11+阅读 · 2019年9月23日

相关基金

炸药快速检测用纳米胶体阵列的设计与制备

国家自然科学基金

0+阅读 · 2015年12月31日

阵列摆式波浪发电装置的性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于微结构演化的金属材料塑性本构模型研究

国家自然科学基金

0+阅读 · 2013年12月31日

新型BODIPY-茚酮类近红外荧光染料的合成与光物理性质研究

国家自然科学基金

0+阅读 · 2012年12月31日

超高强钢热成形损伤控制与引导车身抗撞零件性能优化设计研究

国家自然科学基金

1+阅读 · 2012年12月31日

miRNASNPs调控鸡miRNA生物合成及功能的机制

国家自然科学基金

0+阅读 · 2012年12月31日

火灾下相贯焊接钢管节点力学性能和破坏机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

双极性树枝状蓝光PhOLED用Ir（Ⅲ）金属配合物的合成与性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

有机配位键合的新型多元金属硫属化物的合成及其性能研究

国家自然科学基金

0+阅读 · 2011年12月31日

改进Max-SAT算法的关键技术研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员