Automated image captioning is a deep learning application that fuses work from computer vision and natural language processing, and it is typically performed using encoder-decoder architectures. In this project, we have implemented and experimented with several variants of multi-modal image captioning networks, exploring ResNet101-, DenseNet121-, and VGG19-based CNN encoders paired with attention-based LSTM decoders. We have studied the effect of beam size and of using pretrained word embeddings, and compared these models against a baseline CNN encoder and RNN decoder architecture. The goal is to analyze the performance of each approach using several evaluation metrics, including BLEU, CIDEr, ROUGE, and METEOR. We have also explored model explainability using Visual Attention Maps (VAM) to highlight the parts of the image that contribute most to predicting each word of the generated caption.
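For orientation, the following is a minimal sketch of the kind of architecture described above: a pretrained CNN (here ResNet101) used as a spatial feature extractor, feeding an attention-based LSTM decoder. It assumes PyTorch and a recent torchvision; class names, layer dimensions, and the additive-attention formulation are illustrative assumptions, not the exact implementation used in this project.

```python
# Illustrative sketch, not the project's actual code.
import torch
import torch.nn as nn
import torchvision.models as models


class Encoder(nn.Module):
    """CNN encoder: pretrained ResNet101 with the classifier head removed,
    returning a grid of spatial features for the attention decoder."""
    def __init__(self):
        super().__init__()
        resnet = models.resnet101(weights=models.ResNet101_Weights.DEFAULT)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])  # keep conv features only

    def forward(self, images):                        # images: (B, 3, H, W)
        feats = self.backbone(images)                 # (B, 2048, h, w)
        return feats.flatten(2).permute(0, 2, 1)      # (B, h*w, 2048) regions


class AttentionDecoder(nn.Module):
    """LSTM decoder with additive attention over image regions; the attention
    weights (alphas) can be reshaped to the feature grid to visualize which
    image regions drive each predicted word."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512,
                 feat_dim=2048, attention_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.att_feat = nn.Linear(feat_dim, attention_dim)
        self.att_hid = nn.Linear(hidden_dim, attention_dim)
        self.att_score = nn.Linear(attention_dim, 1)
        self.lstm = nn.LSTMCell(embed_dim + feat_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, feats, captions):               # feats: (B, R, feat_dim)
        B, T = captions.shape
        h = feats.new_zeros(B, self.lstm.hidden_size)
        c = feats.new_zeros(B, self.lstm.hidden_size)
        emb = self.embed(captions)                    # (B, T, embed_dim)
        logits, alphas = [], []
        for t in range(T):
            # additive attention: score each of the R regions against the hidden state
            score = self.att_score(torch.tanh(
                self.att_feat(feats) + self.att_hid(h).unsqueeze(1)))  # (B, R, 1)
            alpha = torch.softmax(score, dim=1)
            context = (alpha * feats).sum(dim=1)      # (B, feat_dim) attended context
            h, c = self.lstm(torch.cat([emb[:, t], context], dim=1), (h, c))
            logits.append(self.fc(h))
            alphas.append(alpha.squeeze(-1))          # kept for visual attention maps
        return torch.stack(logits, dim=1), torch.stack(alphas, dim=1)
```

Swapping DenseNet121 or VGG19 into the encoder amounts to changing the backbone and `feat_dim`; beam search and pretrained word embeddings would replace the greedy decoding loop and the randomly initialized `nn.Embedding`, respectively.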