Visual Prompting via Image Inpainting - Zhuanzhi Papers


September 1, 2022

Visual Prompting via Image Inpainting


Amir Bar, Yossi Gandelsman, Trevor Darrell, Amir Globerson, Alexei A. Efros

Source: arXiv. Project page: https://yossigandelsman.github.io/visual_prompt

How does one adapt a pre-trained visual model to novel downstream tasks without task-specific fine-tuning or any model modification? Inspired by prompting in NLP, this paper investigates visual prompting: given input-output image example(s) of a new task at test time and a new input image, the goal is to automatically produce the output image, consistent with the given examples. We show that posing this problem as simple image inpainting (literally just filling in a hole in a concatenated visual prompt image) turns out to be surprisingly effective, provided that the inpainting algorithm has been trained on the right data. We train masked auto-encoders on a new dataset that we curated: 88k unlabeled figures sourced from academic papers on arXiv. We apply visual prompting to these pre-trained models and demonstrate results on various downstream image-to-image tasks, including foreground segmentation, single object detection, colorization, and edge detection.
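The "concatenated visual prompt image" with a hole can be illustrated with a minimal sketch. It is not the authors' code: the 2x2 layout, the function names `build_visual_prompt` and `extract_prediction`, and the `fill_value` parameter are all assumptions made for illustration, and no inpainting model is included. The sketch only shows how an example pair and a query image might be arranged so that the missing output becomes a region for an inpainting model to fill.

```python
import numpy as np


def build_visual_prompt(example_input, example_output, query_input, fill_value=0):
    """Arrange three images and one blank cell into a 2x2 grid.

    Assumed layout (hypothetical, for illustration only):
        [ example_input | example_output ]
        [ query_input   | hole           ]
    Returns the grid and a boolean mask that is True inside the hole.
    """
    shapes = {example_input.shape, example_output.shape, query_input.shape}
    if len(shapes) != 1:
        raise ValueError("all images must share the same shape")

    h, w = query_input.shape[:2]
    hole = np.full_like(query_input, fill_value)  # region the model must fill

    top = np.concatenate([example_input, example_output], axis=1)
    bottom = np.concatenate([query_input, hole], axis=1)
    grid = np.concatenate([top, bottom], axis=0)

    mask = np.zeros(grid.shape[:2], dtype=bool)
    mask[h:, w:] = True  # bottom-right cell is the hole
    return grid, mask


def extract_prediction(inpainted_grid, mask):
    """Crop the hole region out of a grid after it has been inpainted."""
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    return inpainted_grid[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


# Usage with random stand-in images; a real pipeline would pass `grid` and
# `mask` to a pre-trained inpainting model before cropping the prediction.
rng = np.random.default_rng(0)
ex_in, ex_out, query = (rng.random((4, 4, 3)) for _ in range(3))
grid, mask = build_visual_prompt(ex_in, ex_out, query)
prediction = extract_prediction(grid, mask)
```

Under this assumed layout, the task is specified entirely by the example pair in the top row, so switching from, say, colorization to edge detection would mean swapping the example images rather than changing the model.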

