AutoCaption: 带有神经结构搜索的图像说明 (AutoCaption: Image Captioning with Neural Architecture Search) - 专知论文

会员服务 ·

0

图像字幕 · Better · 可理解性 · 层 · RNN ·

2021 年 11 月 16 日

AutoCaption: Image Captioning with Neural Architecture Search

翻译：AutoCaption: 带有神经结构搜索的图像说明

Xinxin Zhu,Weining Wang,Longteng Guo,Jing Liu

Image captioning transforms complex visual information into abstract natural language for representation, which can help computers understanding the world quickly. However, due to the complexity of the real environment, it needs to identify key objects and realize their connections, and further generate natural language. The whole process involves a visual understanding module and a language generation module, which brings more challenges to the design of deep neural networks than other tasks. Neural Architecture Search (NAS) has shown its important role in a variety of image recognition tasks. Besides, RNN plays an essential role in the image captioning task. We introduce a AutoCaption method to better design the decoder module of the image captioning where we use the NAS to design the decoder module called AutoRNN automatically. We use the reinforcement learning method based on shared parameters for automatic design the AutoRNN efficiently. The search space of the AutoCaption includes connections between the layers and the operations in layers both, and it can make AutoRNN express more architectures. In particular, RNN is equivalent to a subset of our search space. Experiments on the MSCOCO datasets show that our AutoCaption model can achieve better performance than traditional hand-design methods.

翻译：图像字幕将复杂的视觉信息转换为抽象的自然代表语言,这可以帮助计算机快速理解世界。但是,由于真实环境的复杂性,它需要识别关键对象并实现它们的连接,并进一步生成自然语言。整个过程涉及视觉理解模块和语言生成模块,这给深神经网络的设计带来比其他任务更多的挑战。神经结构搜索(NAS)在各种图像识别任务中显示了它的重要作用。此外,NNN在图像描述任务中起着重要作用。我们引入了自动功能化方法,以更好地设计图像说明的解码模块,我们在这里使用NAS来自动设计名为AutoRNN的解码模块。我们使用基于共享参数的强化学习方法来自动设计AutoRNNN。Autaption的搜索空间包括层层层与操作之间的联系,它可以使AutoRNN(AU)表达更多的结构。特别是, RNNN相当于我们搜索空间的一部分。在MSCO数据集上的实验显示,我们的自动描述模型能够比传统的手写方法更好的性。

0

相关内容

图像字幕

图像字幕（Image Captioning）,是指从图像生成文本描述的过程，主要根据图像中物体和物体的动作。

最新《自动机器学习》综述论文，AutoML: A Survey of the State-of-the-Art

最新《自动机器学习》综述论文，AutoML: A Survey of the State-of-the-Art

专知会员服务

93+阅读 · 2020年7月10日

深度学习搜索，Exploring Deep Learning for Search

深度学习搜索，Exploring Deep Learning for Search

专知会员服务

61+阅读 · 2020年5月9日

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

专知会员服务

39+阅读 · 2020年4月6日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【北京大学】探索提取跨模态信息进行图像caption，Exploring and Distilling Cross-Modal Information for Image Captioning

【北京大学】探索提取跨模态信息进行图像caption，Exploring and Distilling Cross-Modal Information for Image Captioning

专知会员服务

54+阅读 · 2020年3月3日

【论文|Google】基于元学习的排序架构，Ranking architectures using meta-learning

【论文|Google】基于元学习的排序架构，Ranking architectures using meta-learning

专知会员服务

18+阅读 · 2019年11月30日

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

专知会员服务

31+阅读 · 2019年11月25日

【Google大脑Sara Sabour】胶囊架构（Capsule Architectures），附47页ppt

【Google大脑Sara Sabour】胶囊架构（Capsule Architectures），附47页ppt

专知会员服务

39+阅读 · 2019年11月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

AutoML与轻量模型大列表

AutoML与轻量模型大列表

专知

8+阅读 · 2019年4月29日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

论文 | CVPR2017有哪些值得读的Image Caption论文？

论文 | CVPR2017有哪些值得读的Image Caption论文？

黑龙江大学自然语言处理实验室

16+阅读 · 2017年12月1日

CVPR2017有哪些值得读的Image Caption论文？

CVPR2017有哪些值得读的Image Caption论文？

PaperWeekly

10+阅读 · 2017年11月29日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

Neural Architecture Generator Optimization

Arxiv

6+阅读 · 2020年10月8日

Auto-GNN: Neural Architecture Search of Graph Neural Networks

Auto-GNN: Neural Architecture Search of Graph Neural Networks

Arxiv

7+阅读 · 2019年9月10日

A sequential guiding network with attention for image captioning

A sequential guiding network with attention for image captioning

Arxiv

5+阅读 · 2019年2月8日

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

Arxiv

5+阅读 · 2019年1月10日

Hierarchical LSTMs with Adaptive Attention for Visual Captioning

Hierarchical LSTMs with Adaptive Attention for Visual Captioning

Arxiv

5+阅读 · 2018年12月26日

Image Captioning based on Deep Reinforcement Learning

Image Captioning based on Deep Reinforcement Learning

Arxiv

9+阅读 · 2018年9月13日

Neural Architecture Optimization

Neural Architecture Optimization

Arxiv

8+阅读 · 2018年9月5日

Recurrent Fusion Network for Image Captioning

Recurrent Fusion Network for Image Captioning

Arxiv

3+阅读 · 2018年7月31日

Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention

Arxiv

7+阅读 · 2018年5月21日

Video Captioning via Hierarchical Reinforcement Learning

Arxiv

20+阅读 · 2018年3月29日

VIP会员

文章信息

相关主题

相关VIP内容

最新《自动机器学习》综述论文，AutoML: A Survey of the State-of-the-Art

最新《自动机器学习》综述论文，AutoML: A Survey of the State-of-the-Art

专知会员服务

93+阅读 · 2020年7月10日

深度学习搜索，Exploring Deep Learning for Search

深度学习搜索，Exploring Deep Learning for Search

专知会员服务

61+阅读 · 2020年5月9日

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

专知会员服务

39+阅读 · 2020年4月6日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【北京大学】探索提取跨模态信息进行图像caption，Exploring and Distilling Cross-Modal Information for Image Captioning

【北京大学】探索提取跨模态信息进行图像caption，Exploring and Distilling Cross-Modal Information for Image Captioning

专知会员服务

54+阅读 · 2020年3月3日

【论文|Google】基于元学习的排序架构，Ranking architectures using meta-learning

【论文|Google】基于元学习的排序架构，Ranking architectures using meta-learning

专知会员服务

18+阅读 · 2019年11月30日

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

【Google】神经架构搜索（Neural Architecture Search and Beyond），Barret Zoph

专知会员服务

31+阅读 · 2019年11月25日

【Google大脑Sara Sabour】胶囊架构（Capsule Architectures），附47页ppt

【Google大脑Sara Sabour】胶囊架构（Capsule Architectures），附47页ppt

专知会员服务

39+阅读 · 2019年11月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

《乌克兰无人机产业：志愿者与政策在构建新兴无人机产业中的协同作用》最新报告

《人工智能辅助决策中的数据可视化：系统性综述》

人工智能驱动弹药制造现代化：美国陆军转型之路

《敏捷作战部署中枢纽-辐条基地选址优化研究》80页

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

AutoML与轻量模型大列表

AutoML与轻量模型大列表

专知

8+阅读 · 2019年4月29日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

论文 | CVPR2017有哪些值得读的Image Caption论文？

论文 | CVPR2017有哪些值得读的Image Caption论文？

黑龙江大学自然语言处理实验室

16+阅读 · 2017年12月1日

CVPR2017有哪些值得读的Image Caption论文？

CVPR2017有哪些值得读的Image Caption论文？

PaperWeekly

10+阅读 · 2017年11月29日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

Neural Architecture Generator Optimization

Arxiv

6+阅读 · 2020年10月8日

Auto-GNN: Neural Architecture Search of Graph Neural Networks

Auto-GNN: Neural Architecture Search of Graph Neural Networks

Arxiv

7+阅读 · 2019年9月10日

A sequential guiding network with attention for image captioning

A sequential guiding network with attention for image captioning

Arxiv

5+阅读 · 2019年2月8日

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

Arxiv

5+阅读 · 2019年1月10日

Hierarchical LSTMs with Adaptive Attention for Visual Captioning

Hierarchical LSTMs with Adaptive Attention for Visual Captioning

Arxiv

5+阅读 · 2018年12月26日

Image Captioning based on Deep Reinforcement Learning

Image Captioning based on Deep Reinforcement Learning

Arxiv

9+阅读 · 2018年9月13日

Neural Architecture Optimization

Neural Architecture Optimization

Arxiv

8+阅读 · 2018年9月5日

Recurrent Fusion Network for Image Captioning

Recurrent Fusion Network for Image Captioning

Arxiv

3+阅读 · 2018年7月31日

Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention

Arxiv

7+阅读 · 2018年5月21日

Video Captioning via Hierarchical Reinforcement Learning

Arxiv

20+阅读 · 2018年3月29日

微信扫码咨询专知VIP会员