学习用于图像描述的指南代号 (Learning to Guide Decoding for Image Captioning) - 专知论文

会员服务 ·

0

图像字幕 · Extensibility · Networking · 解码 · 学成 ·

2018 年 4 月 3 日

Learning to Guide Decoding for Image Captioning

翻译：学习用于图像描述的指南代号

Wenhao Jiang,Lin Ma,Xinpeng Chen,Hanwang Zhang,Wei Liu

from arxiv, AAAI-18

Recently, much advance has been made in image captioning, and an encoder-decoder framework has achieved outstanding performance for this task. In this paper, we propose an extension of the encoder-decoder framework by adding a component called guiding network. The guiding network models the attribute properties of input images, and its output is leveraged to compose the input of the decoder at each time step. The guiding network can be plugged into the current encoder-decoder framework and trained in an end-to-end manner. Hence, the guiding vector can be adaptively learned according to the signal from the decoder, making itself to embed information from both image and language. Additionally, discriminative supervision can be employed to further improve the quality of guidance. The advantages of our proposed approach are verified by experiments carried out on the MS COCO dataset.

翻译：最近,在图像字幕方面已经取得了很大进展,而编码器-编码器-编码器框架已经取得了这一任务的杰出业绩。在本文件中,我们建议通过添加一个名为指导网络的组件来扩展编码器-编码器框架。指导网络模型是输入图像的属性,其输出被利用来组成每个步骤的解码器输入。指导网络可以插进目前的编码器-编码器框架,并以端到端的方式进行培训。因此,指导矢量可以根据解码器的信号来适应性地学习,从而将图像和语言的信息都嵌入其中。此外,还可以利用歧视性的监督来进一步提高指导质量。我们拟议方法的优点可以通过在MS COCO数据集上进行的实验来验证。

6

相关内容

图像字幕

图像字幕（Image Captioning）,是指从图像生成文本描述的过程，主要根据图像中物体和物体的动作。

【SIGIR2020】学习词项区分性，Learning Term Discrimination

【SIGIR2020】学习词项区分性，Learning Term Discrimination

专知会员服务

16+阅读 · 2020年4月28日

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

专知会员服务

53+阅读 · 2020年4月7日

【CVPR2020】用于图像超分辨率的深度展开网络，Deep Unfolding Network for Image Super-Resolution

【CVPR2020】用于图像超分辨率的深度展开网络，Deep Unfolding Network for Image Super-Resolution

专知会员服务

44+阅读 · 2020年3月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

深度学习生物图像重建综述，Deep Learning for Biomedical Image Reconstruction: A Survey

深度学习生物图像重建综述，Deep Learning for Biomedical Image Reconstruction: A Survey

专知会员服务

40+阅读 · 2020年3月2日

【图像分割| 2019最新综述】生物医学图像分割的机器学习技术：技术方面综述和最新应用介绍，附35页PDF（Machine Learning Techniques for Biomedical Image Segmentation）

【图像分割| 2019最新综述】生物医学图像分割的机器学习技术：技术方面综述和最新应用介绍，附35页PDF（Machine Learning Techniques for Biomedical Image Segmentation）

专知会员服务

49+阅读 · 2019年11月16日

【英国帝国理工学院】心脏图像分割的深度学习:综述，47页pdf，Deep learning for cardiac image segmentation: A review

【英国帝国理工学院】心脏图像分割的深度学习:综述，47页pdf，Deep learning for cardiac image segmentation: A review

专知会员服务

56+阅读 · 2019年11月14日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

自适应注意力机制在Image Caption中的应用

自适应注意力机制在Image Caption中的应用

PaperWeekly

10+阅读 · 2018年5月10日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

Hierarchy Parsing for Image Captioning

Hierarchy Parsing for Image Captioning

Arxiv

6+阅读 · 2019年9月10日

Neural Image Captioning

Neural Image Captioning

Arxiv

5+阅读 · 2019年7月2日

Image Captioning: Transforming Objects into Words

Image Captioning: Transforming Objects into Words

Arxiv

7+阅读 · 2019年6月14日

A sequential guiding network with attention for image captioning

A sequential guiding network with attention for image captioning

Arxiv

5+阅读 · 2019年2月8日

Image Captioning based on Deep Reinforcement Learning

Image Captioning based on Deep Reinforcement Learning

Arxiv

9+阅读 · 2018年9月13日

Recurrent Fusion Network for Image Captioning

Recurrent Fusion Network for Image Captioning

Arxiv

3+阅读 · 2018年7月31日

Improving Image Captioning with Conditional Generative Adversarial Nets

Arxiv

9+阅读 · 2018年5月18日

End-to-End Video Captioning with Multitask Reinforcement Learning

Arxiv

5+阅读 · 2018年3月21日

Stack-Captioning: Coarse-to-Fine Learning for Image Captioning

Arxiv

6+阅读 · 2018年3月14日

Where to put the Image in an Image Caption Generator

Arxiv

3+阅读 · 2018年3月14日

VIP会员

文章信息

相关主题

相关VIP内容

【SIGIR2020】学习词项区分性，Learning Term Discrimination

【SIGIR2020】学习词项区分性，Learning Term Discrimination

专知会员服务

16+阅读 · 2020年4月28日

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

现代深度学习技术在自然语言处理的应用（Modern Deep Learning Techniques Applied to Natural Language Processing）

专知会员服务

53+阅读 · 2020年4月7日

【CVPR2020】用于图像超分辨率的深度展开网络，Deep Unfolding Network for Image Super-Resolution

【CVPR2020】用于图像超分辨率的深度展开网络，Deep Unfolding Network for Image Super-Resolution

专知会员服务

44+阅读 · 2020年3月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

深度学习生物图像重建综述，Deep Learning for Biomedical Image Reconstruction: A Survey

深度学习生物图像重建综述，Deep Learning for Biomedical Image Reconstruction: A Survey

专知会员服务

40+阅读 · 2020年3月2日

【图像分割| 2019最新综述】生物医学图像分割的机器学习技术：技术方面综述和最新应用介绍，附35页PDF（Machine Learning Techniques for Biomedical Image Segmentation）

【图像分割| 2019最新综述】生物医学图像分割的机器学习技术：技术方面综述和最新应用介绍，附35页PDF（Machine Learning Techniques for Biomedical Image Segmentation）

专知会员服务

49+阅读 · 2019年11月16日

【英国帝国理工学院】心脏图像分割的深度学习:综述，47页pdf，Deep learning for cardiac image segmentation: A review

【英国帝国理工学院】心脏图像分割的深度学习:综述，47页pdf，Deep learning for cardiac image segmentation: A review

专知会员服务

56+阅读 · 2019年11月14日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《俄乌战争中俄罗斯两栖作战能力：黑海舰队战力与作战失利研究》2025年最新111页

《任务式指挥十六个案例研究》232页

《实现多层防御多轮交战机制的扩展型随机齐射模型》2025年最新83页

《美军条令：小部队指挥官山地作战指南》最新238页

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

自适应注意力机制在Image Caption中的应用

自适应注意力机制在Image Caption中的应用

PaperWeekly

10+阅读 · 2018年5月10日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

相关论文

Hierarchy Parsing for Image Captioning

Hierarchy Parsing for Image Captioning

Arxiv

6+阅读 · 2019年9月10日

Neural Image Captioning

Neural Image Captioning

Arxiv

5+阅读 · 2019年7月2日

Image Captioning: Transforming Objects into Words

Image Captioning: Transforming Objects into Words

Arxiv

7+阅读 · 2019年6月14日

A sequential guiding network with attention for image captioning

A sequential guiding network with attention for image captioning

Arxiv

5+阅读 · 2019年2月8日

Image Captioning based on Deep Reinforcement Learning

Image Captioning based on Deep Reinforcement Learning

Arxiv

9+阅读 · 2018年9月13日

Recurrent Fusion Network for Image Captioning

Recurrent Fusion Network for Image Captioning

Arxiv

3+阅读 · 2018年7月31日

Improving Image Captioning with Conditional Generative Adversarial Nets

Arxiv

9+阅读 · 2018年5月18日

End-to-End Video Captioning with Multitask Reinforcement Learning

Arxiv

5+阅读 · 2018年3月21日

Stack-Captioning: Coarse-to-Fine Learning for Image Captioning

Arxiv

6+阅读 · 2018年3月14日

Where to put the Image in an Image Caption Generator

Arxiv

3+阅读 · 2018年3月14日

微信扫码咨询专知VIP会员