Designed to learn long-range interactions on sequential data, transformers continue to show state-of-the-art results on a wide variety of tasks. In contrast to CNNs, they contain no inductive bias that prioritizes local interactions. This makes them expressive, but also computationally infeasible for long sequences, such as high-resolution images. We demonstrate how combining the effectiveness of the inductive bias of CNNs with the expressivity of transformers enables them to model and thereby synthesize high-resolution images. We show how to (i) use CNNs to learn a context-rich vocabulary of image constituents, and in turn (ii) utilize transformers to efficiently model their composition within high-resolution images. Our approach is readily applied to conditional synthesis tasks, where both non-spatial information, such as object classes, and spatial information, such as segmentations, can control the generated image. In particular, we present the first results on semantically-guided synthesis of megapixel images with transformers and obtain the state of the art among autoregressive models on class-conditional ImageNet. Code and pretrained models can be found at https://github.com/CompVis/taming-transformers .
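The two-stage idea described above — (i) compress an image into a discrete vocabulary of constituents, then (ii) model the resulting token sequence autoregressively — can be sketched at toy scale. All names, sizes, and the nearest-neighbour lookup below are illustrative assumptions, not the released code's API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage (i), toy version: a learned codebook maps patch features to discrete indices.
# 16 code vectors of dimension 4; a real codebook is far larger and CNN-learned.
codebook = rng.normal(size=(16, 4))
patches = rng.normal(size=(8 * 8, 4))  # an 8x8 grid of patch features

def quantize(z, book):
    """Replace each feature vector by the index of its nearest codebook entry."""
    dists = ((z[:, None, :] - book[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# Stage (ii) would then treat `tokens` as a sequence for an autoregressive
# transformer, predicting each index from the preceding ones.
tokens = quantize(patches, codebook)
print(tokens.shape)  # one discrete token per patch: (64,)
```

Because the transformer only ever sees the short index sequence (here 64 tokens) rather than raw pixels, its quadratic attention cost stays manageable even for high-resolution outputs.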