通过对话框进行交互式图像编辑的 GAN GAN 序列关注 (Sequential Attention GAN for Interactive Image Editing via Dialogue) - 专知论文

会员服务 ·

0

INTERACT · 任务对话系统 · Attentive GAN · 注意力机制 · GAN ·

2019 年 9 月 8 日

Sequential Attention GAN for Interactive Image Editing via Dialogue

翻译：通过对话框进行交互式图像编辑的 GAN GAN 序列关注

Yu Cheng,Zhe Gan,Yitong Li,Jingjing Liu,Jianfeng Gao

We introduce a new task - Interactive Image Editing via conversational language, where users can guide an agent to edit images via multi-turn dialogue. In each dialogue turn, the agent takes a source image and a natural language description as the input, and generates a modified image following the textual description. Two new datasets are introduced for this task (Zap-Seq and DeepFashion-Seq), which contain multi-turn dialog sessions with crowdsourced image-description sequences. The main challenges in this sequential and interactive image generation task are two-fold: 1) contextual consistency between a generated image and the given textual description; 2) step-by-step region-level modification to maintain visual consistency across the image sequence. To address these challenges, we propose a novel Sequential Attention Generative Adversarial Network (SeqAttnGAN) framework, which applies a neural state tracker to encode the previous image and the textual description in each dialogue turn, and uses a GAN framework to generate a modified version of the image that is consistent with the dialogue context and preceding images. To achieve better region-specific refinement, we also introduce a sequential attention mechanism into the model. Experiments on Zap-Seq and DeepFashion-Seq datasets show that the proposed SeqAttnGAN model outperforms state-of-the-art approaches on the interactive image editing task across all evaluation metrics on visual quality, image sequence coherence and text-image consistency.

翻译：我们引入了一个新任务 - 通过对话语言的交互式图像编辑, 用户可以在其中指导一个代理机构通过多方向对话框编辑图像。每次对话框转折时, 代理机构将源图像和自然语言描述作为输入, 并在文本描述后生成一个修改图像。为此任务引入了两个新的数据集( Zap- Seq 和 DeepFashion-Seq), 包含包含由多方源图像描述序列组成的多点对话会话。此相继和交互式图像生成任务的主要挑战有两重:1) 生成图像和给定文本描述之间的背景一致性; 2) 一步一步步的直观级修改, 以保持图像序列之间的视觉一致性。为了应对这些挑战, 我们提出了一个新的序列关注引力 Adversarial 网络 (SeqAtnGAN) 框架, 该框架将神经状态跟踪器用于对先前图像和文字描述的编码, 并使用 GAN 框架生成一个与对话框背景和前图像相一致的修改版本; 2) 一步步步的图像质量修改, 以更清晰的方式对图像进行更清晰的精确的精确的精确的精确度调整, 我们将SASqreareal- sqreal- sqreal

0

相关内容

INTERACT

IFIP TC13 Conference on Human-Computer Interaction是人机交互领域的研究者和实践者展示其工作的重要平台。多年来，这些会议吸引了来自几个国家和文化的研究人员。官网链接：http://interact2019.org/

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【文章|自注意力(self-attention)机制图解】《Illustrated: Self-Attention》by Raimi Karim

【文章|自注意力(self-attention)机制图解】《Illustrated: Self-Attention》by Raimi Karim

专知会员服务

45+阅读 · 2019年11月18日

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

专知会员服务

43+阅读 · 2019年11月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

【论文推荐】最新七篇自动问答相关论文—答案重排序、电影问答、句子间交互、用户意图、实体链接、多尺度匹配对抗训练

【论文推荐】最新七篇自动问答相关论文—答案重排序、电影问答、句子间交互、用户意图、实体链接、多尺度匹配对抗训练

专知

7+阅读 · 2018年5月8日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Image to Image Translation Using GAN - Part 2 | 每周话题精选 #06

Image to Image Translation Using GAN - Part 2 | 每周话题精选 #06

PaperWeekly

5+阅读 · 2017年7月19日

Improving Visual Question Answering by Referring to Generated Paragraph Captions

Improving Visual Question Answering by Referring to Generated Paragraph Captions

Arxiv

7+阅读 · 2019年6月14日

A sequential guiding network with attention for image captioning

A sequential guiding network with attention for image captioning

Arxiv

5+阅读 · 2019年2月8日

Image Captioning based on Deep Reinforcement Learning

Image Captioning based on Deep Reinforcement Learning

Arxiv

9+阅读 · 2018年9月13日

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

Arxiv

4+阅读 · 2018年7月29日

Compositional GAN: Learning Conditional Image Composition

Compositional GAN: Learning Conditional Image Composition

Arxiv

31+阅读 · 2018年7月19日

Joint Image Captioning and Question Answering

Arxiv

6+阅读 · 2018年5月22日

Dialog-based Interactive Image Retrieval

Arxiv

5+阅读 · 2018年5月1日

IQA: Visual Question Answering in Interactive Environments

Arxiv

5+阅读 · 2018年4月5日

Differential Attention for Visual Question Answering

Arxiv

5+阅读 · 2018年4月3日

Dual Recurrent Attention Units for Visual Question Answering

Arxiv

7+阅读 · 2018年2月1日

VIP会员

文章信息

相关主题

任务对话系统

注意力机制

相关VIP内容

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【文章|自注意力(self-attention)机制图解】《Illustrated: Self-Attention》by Raimi Karim

【文章|自注意力(self-attention)机制图解】《Illustrated: Self-Attention》by Raimi Karim

专知会员服务

45+阅读 · 2019年11月18日

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

专知会员服务

43+阅读 · 2019年11月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

《乌克兰无人机产业：志愿者与政策在构建新兴无人机产业中的协同作用》最新报告

《人工智能辅助决策中的数据可视化：系统性综述》

人工智能驱动弹药制造现代化：美国陆军转型之路

《敏捷作战部署中枢纽-辐条基地选址优化研究》80页

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

【论文推荐】最新七篇自动问答相关论文—答案重排序、电影问答、句子间交互、用户意图、实体链接、多尺度匹配对抗训练

【论文推荐】最新七篇自动问答相关论文—答案重排序、电影问答、句子间交互、用户意图、实体链接、多尺度匹配对抗训练

专知

7+阅读 · 2018年5月8日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Image to Image Translation Using GAN - Part 2 | 每周话题精选 #06

Image to Image Translation Using GAN - Part 2 | 每周话题精选 #06

PaperWeekly

5+阅读 · 2017年7月19日

相关论文

Improving Visual Question Answering by Referring to Generated Paragraph Captions

Improving Visual Question Answering by Referring to Generated Paragraph Captions

Arxiv

7+阅读 · 2019年6月14日

A sequential guiding network with attention for image captioning

A sequential guiding network with attention for image captioning

Arxiv

5+阅读 · 2019年2月8日

Image Captioning based on Deep Reinforcement Learning

Image Captioning based on Deep Reinforcement Learning

Arxiv

9+阅读 · 2018年9月13日

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

Arxiv

4+阅读 · 2018年7月29日

Compositional GAN: Learning Conditional Image Composition

Compositional GAN: Learning Conditional Image Composition

Arxiv

31+阅读 · 2018年7月19日

Joint Image Captioning and Question Answering

Arxiv

6+阅读 · 2018年5月22日

Dialog-based Interactive Image Retrieval

Arxiv

5+阅读 · 2018年5月1日

IQA: Visual Question Answering in Interactive Environments

Arxiv

5+阅读 · 2018年4月5日

Differential Attention for Visual Question Answering

Arxiv

5+阅读 · 2018年4月3日

Dual Recurrent Attention Units for Visual Question Answering

Arxiv

7+阅读 · 2018年2月1日

微信扫码咨询专知VIP会员