UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image - 专知 (Zhuanzhi) Papers

October 19, 2022

Dani Valevski, Matan Kalman, Yossi Matias, Yaniv Leviathan

We present UniTune, a simple and novel method for general text-driven image editing. UniTune gets as input an arbitrary image and a textual edit description, and carries out the edit while maintaining high semantic and visual fidelity to the input image. UniTune uses text, an intuitive interface for art-direction, and does not require additional inputs, like masks or sketches. At the core of our method is the observation that with the right choice of parameters, we can fine-tune a large text-to-image diffusion model on a single image, encouraging the model to maintain fidelity to the input image while still allowing expressive manipulations. We used Imagen as our text-to-image model, but we expect UniTune to work with other large-scale models as well. We test our method in a range of different use cases, and demonstrate its wide applicability.
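The abstract does not give training details, so the following is only a toy sketch of the general idea of fine-tuning a denoising model on a single image. It assumes a standard noise-prediction objective and replaces the pretrained text-to-image model (Imagen in the paper) with a hypothetical linear map. Text conditioning is omitted, and the names `make_noised_batch` and `finetune` are illustrative, not taken from the paper.

```python
import numpy as np


def make_noised_batch(image, num_samples=64, seed=0):
    """Build (noisy image, noise) training pairs from ONE image.

    Assumed forward process, with a random signal level alpha_bar per pair:
        x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps
    """
    rng = np.random.default_rng(seed)
    x0 = image.reshape(-1).astype(np.float64)
    alpha_bar = rng.uniform(0.1, 0.9, size=(num_samples, 1))
    eps = rng.standard_normal((num_samples, x0.size))
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps


def finetune(x_t, eps, steps=50, lr=0.1):
    """Full-batch gradient descent on the mean squared noise-prediction error.

    The "model" is a hypothetical linear map eps_hat = x_t @ W.T + b,
    a stand-in for a large pretrained denoiser.
    """
    n, d = x_t.shape
    W = np.zeros((d, d))
    b = np.zeros(d)
    losses = []
    for _ in range(steps):
        err = x_t @ W.T + b - eps  # prediction error, shape (n, d)
        losses.append(float(np.mean(err ** 2)))  # loss before this update
        W -= lr * (2.0 / (n * d)) * (err.T @ x_t)
        b -= lr * (2.0 / (n * d)) * err.sum(axis=0)
    return W, b, losses


# A 4x4 "image" with values in [-1, 1] (made-up data for illustration).
image = np.linspace(-1.0, 1.0, 16).reshape(4, 4)
x_t, eps = make_noised_batch(image)
W, b, losses = finetune(x_t, eps)
print(f"loss at first step: {losses[0]:.3f}, at last step: {losses[-1]:.3f}")
```

In this toy setting, `steps` and `lr` control how strongly the model adapts to the single image. Which settings UniTune actually uses is not stated in the abstract.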

