Language-guided image inpainting aims to fill in the defective regions of an image under the guidance of text while keeping the non-defective regions unchanged. However, the encoding process of existing models suffers from either receptive spreading of defective regions or information loss in non-defective regions, giving rise to visually unappealing inpainting results. To address these issues, this paper proposes N\"UWA-LIP, which incorporates a defect-free VQGAN (DF-VQGAN) with a multi-perspective sequence-to-sequence module (MP-S2S). In particular, DF-VQGAN introduces relative estimation to control receptive spreading and adopts symmetrical connections to protect information. MP-S2S further enhances visual information from complementary perspectives, including both low-level pixels and high-level tokens. Experiments show that DF-VQGAN is more robust than VQGAN. To evaluate the inpainting performance of our model, we build three open-domain benchmarks, on which N\"UWA-LIP also outperforms recent strong baselines.