TextDiffuser: Diffusion Models as Text Painters


May 18, 2023


Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei

Diffusion models have gained increasing attention for their impressive generation abilities but currently struggle with rendering accurate and coherent text. To address this issue, we introduce TextDiffuser, which focuses on generating images with visually appealing text that is coherent with the background. TextDiffuser consists of two stages: first, a Transformer model generates the layout of keywords extracted from the text prompt; then, diffusion models generate images conditioned on the text prompt and the generated layout. Additionally, we contribute the first large-scale text image dataset with OCR annotations, MARIO-10M, containing 10 million image-text pairs with text recognition, detection, and character-level segmentation annotations. We further collect the MARIO-Eval benchmark to serve as a comprehensive tool for evaluating text rendering quality. Through experiments and user studies, we show that TextDiffuser can flexibly and controllably create high-quality text images from text prompts alone or together with text template images, and can perform text inpainting to reconstruct incomplete images containing text. The code, model, and dataset will be available at https://aka.ms/textdiffuser.
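The two-stage flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the quoting convention used to pick keywords, the names `extract_keywords`, `toy_layout`, and `layout_to_mask`, and the row-stacking rule that stands in for the Transformer layout model are all assumptions made for illustration, and the diffusion stage is omitted entirely. The sketch only shows the kind of intermediate data (keywords, boxes, a conditioning mask) that could pass from a layout stage to an image generator.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box in pixel coordinates; right and bottom are exclusive."""
    word: str
    left: int
    top: int
    right: int
    bottom: int


def extract_keywords(prompt: str) -> list[str]:
    """Assumption: the text to render is wrapped in single quotes in the prompt."""
    words: list[str] = []
    for phrase in re.findall(r"'([^']+)'", prompt):
        words.extend(phrase.split())
    return words


def toy_layout(words: list[str], width: int = 512, height: int = 512,
               margin: int = 32) -> list[Box]:
    """Hypothetical stand-in for stage one: stack the words in equal-height rows."""
    if not words:
        return []
    row_h = (height - 2 * margin) // len(words)
    return [
        Box(w, margin, margin + i * row_h, width - margin, margin + (i + 1) * row_h)
        for i, w in enumerate(words)
    ]


def layout_to_mask(boxes: list[Box], width: int = 512,
                   height: int = 512) -> list[list[int]]:
    """Rasterise the boxes into a binary mask that a stage-two generator
    could receive as conditioning alongside the text prompt."""
    mask = [[0] * width for _ in range(height)]
    for b in boxes:
        for y in range(b.top, b.bottom):
            for x in range(b.left, b.right):
                mask[y][x] = 1
    return mask


if __name__ == "__main__":
    prompt = "a poster that says 'Hello World'"
    boxes = toy_layout(extract_keywords(prompt), width=64, height=64, margin=8)
    mask = layout_to_mask(boxes, width=64, height=64)
    print([b.word for b in boxes], sum(map(sum, mask)))
```

In a real system the row-stacking rule would be replaced by a learned layout model, and the mask (or a richer character-level map) would be fed to the image generator together with the prompt.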
