Optimizing Prompts for Text-to-Image Generation - Zhuanzhi Papers


December 19, 2022

Optimizing Prompts for Text-to-Image Generation

Yaru Hao, Zewen Chi, Li Dong, Furu Wei

From arXiv, 10 pages

Well-designed prompts can guide text-to-image models to generate amazing images. However, performant prompts are often model-specific and misaligned with user input. Instead of relying on laborious human engineering, we propose prompt adaptation, a general framework that automatically adapts original user input to model-preferred prompts. Specifically, we first perform supervised fine-tuning with a pretrained language model on a small collection of manually engineered prompts. Then we use reinforcement learning to explore better prompts. We define a reward function that encourages the policy to generate more aesthetically pleasing images while preserving the original user intentions. Experimental results on Stable Diffusion show that our method outperforms manual prompt engineering in terms of both automatic metrics and human preference ratings. Moreover, reinforcement learning further boosts performance, especially on out-of-domain prompts. The pretrained checkpoints are available at https://aka.ms/promptist. The demo can be found at https://aka.ms/promptist-demo.
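
The abstract describes the reward only in words: it should favor more aesthetically pleasing images while preserving the original user intention. The sketch below shows one way such a reward could be assembled. It is not the authors' formulation. The additive form, the weights, and every name in it (`prompt_reward`, `RewardWeights`, `toy_generate`, `toy_aesthetic`, `toy_relevance`) are assumptions made for illustration, and the toy scorers stand in for a real text-to-image model, aesthetic predictor, and text-image relevance model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RewardWeights:
    # Hypothetical trade-off weights; the abstract does not specify any.
    aesthetic: float = 1.0
    relevance: float = 1.0


def _mean(values: List[float]) -> float:
    return sum(values) / len(values)


def prompt_reward(
    user_input: str,
    candidate_prompt: str,
    generate: Callable[[str], List[str]],
    aesthetic_score: Callable[[str], float],
    relevance_score: Callable[[str, str], float],
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Assumed reward: aesthetic gain over the raw user input, plus
    relevance of the new images to what the user originally asked for."""
    base_images = generate(user_input)
    new_images = generate(candidate_prompt)
    aesthetic_gain = (
        _mean([aesthetic_score(img) for img in new_images])
        - _mean([aesthetic_score(img) for img in base_images])
    )
    relevance = _mean([relevance_score(user_input, img) for img in new_images])
    return weights.aesthetic * aesthetic_gain + weights.relevance * relevance


# --- Toy stand-ins so the sketch runs without any model (all hypothetical) ---
STYLE_WORDS = {"detailed", "sharp", "cinematic"}


def _words(text: str) -> set:
    return set(text.lower().replace(",", " ").split())


def toy_generate(prompt: str) -> List[str]:
    # Pretend the "image" is just the prompt text that produced it.
    return [prompt]


def toy_aesthetic(image: str) -> float:
    # Count style modifiers as a crude proxy for an aesthetic predictor.
    return float(len(_words(image) & STYLE_WORDS))


def toy_relevance(user_input: str, image: str) -> float:
    # Fraction of the user's words still reflected in the "image".
    wanted = _words(user_input)
    return len(wanted & _words(image)) / len(wanted)


user = "a cat on a sofa"
faithful = "a cat on a sofa, highly detailed, sharp focus"
drifted = "a dog, highly detailed, sharp focus"

faithful_reward = prompt_reward(user, faithful, toy_generate, toy_aesthetic, toy_relevance)
drifted_reward = prompt_reward(user, drifted, toy_generate, toy_aesthetic, toy_relevance)
print(faithful_reward, drifted_reward)
```

In this toy setting both candidates add the same style modifiers, so the candidate that keeps the user's subject scores higher. That is the behavior the abstract asks of the reward: aesthetic improvements should not come at the cost of the original intent.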

