A sequential guiding network with attention for image captioning - Zhuanzhi Papers (专知论文)


Extensibility · Networking · Image captioning · Attention mechanism · Neural Networks

February 8, 2019

A sequential guiding network with attention for image captioning


Daouda Sow, Zengchang Qin, Mouhamed Niasse, Tao Wan

From arXiv; 5 pages, 2 figures, 1 table; IEEE ICASSP 2019

The recent advances of deep learning in both computer vision (CV) and natural language processing (NLP) provide us with a new way of understanding semantics, which lets us tackle more challenging tasks such as automatically generating descriptions of natural images. For this task, the encoder-decoder framework has achieved promising performance when a convolutional neural network (CNN) is used as the image encoder and a recurrent neural network (RNN) as the decoder. In this paper, we introduce a sequential guiding network that guides the decoder during word generation. The new model extends the encoder-decoder framework with attention by adding a guiding long short-term memory (LSTM) network, and it can be trained end-to-end on image/description pairs. We validate our approach with extensive experiments on a benchmark dataset, MS COCO Captions. The proposed model achieves a significant improvement compared to other state-of-the-art deep learning models.
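The abstract does not spell out how the guiding LSTM is wired into the decoder, so the following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes dot-product soft attention over CNN region features, a guiding LSTM that reads the current word and the attended context, and a decoder LSTM that additionally reads the guiding LSTM's hidden state. All names (`attend`, `lstm_step`, `init_lstm`, `decode_step`), the dimensions, and the random weights are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(regions, hidden):
    """Soft attention: score each region against the hidden state (assumed dot product)."""
    weights = softmax([dot(r, hidden) for r in regions])
    dim = len(regions[0])
    context = [sum(w * r[i] for w, r in zip(weights, regions)) for i in range(dim)]
    return context, weights

def init_lstm(in_dim, hid_dim, rng):
    """Random weights for the four LSTM gates, each acting on the concatenation [x; h]."""
    return {
        name: (
            [[rng.uniform(-0.1, 0.1) for _ in range(in_dim + hid_dim)] for _ in range(hid_dim)],
            [0.0] * hid_dim,
        )
        for name in "ifog"
    }

def lstm_step(x, h, c, params):
    """One standard LSTM step on plain Python lists."""
    z = x + h  # list concatenation [x; h]
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))

    def gate(name, act):
        rows, bias = params[name]
        return [act(dot(row, z) + b) for row, b in zip(rows, bias)]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new

def decode_step(word_vec, regions, state, params):
    """Assumed wiring: the guiding LSTM's hidden state is an extra input to the decoder LSTM."""
    (hg, cg), (hd, cd) = state
    context, weights = attend(regions, hd)
    hg, cg = lstm_step(word_vec + context, hg, cg, params["guide"])
    hd, cd = lstm_step(word_vec + context + hg, hd, cd, params["decoder"])
    return ((hg, cg), (hd, cd)), weights

# Toy usage: 5 image regions, hidden size 4, word-embedding size 3 (all illustrative).
rng = random.Random(0)
H, D_W, N_REGIONS = 4, 3, 5
regions = [[rng.uniform(-1, 1) for _ in range(H)] for _ in range(N_REGIONS)]
params = {
    "guide": init_lstm(D_W + H, H, rng),
    "decoder": init_lstm(D_W + 2 * H, H, rng),
}
state = (([0.0] * H, [0.0] * H), ([0.0] * H, [0.0] * H))
word = [rng.uniform(-1, 1) for _ in range(D_W)]
state, weights = decode_step(word, regions, state, params)
```

Because the decoder state starts at zero, the first step attends uniformly over the regions; later steps shift the weights as the decoder state changes. A trained model would learn these weights end-to-end from image/description pairs and add a word-prediction layer on top of the decoder state, which this sketch omits.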



