This paper presents a facial expression recognition model and a description generation model that together build descriptive sentences for images and for the facial expressions of people in them. Our study shows that YOLOv5 achieves better results than a traditional CNN for all emotions on the KDEF dataset: the emotion-recognition accuracies of the CNN and YOLOv5 models are 0.853 and 0.938, respectively. A model for generating image descriptions is proposed based on a merge architecture, using VGG16 for image features and an LSTM to encode the descriptions. YOLOv5 is also used to recognize the dominant colors of objects in the images and to correct the color words in the generated descriptions when necessary. If a description contains words referring to a person, we recognize the emotion of that person in the image. Finally, we combine the results of all models to create sentences that describe both the visual content and the human emotions in the images. Experimental results on the Flickr8k dataset in Vietnamese achieve BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores of 0.628, 0.425, 0.280, and 0.174, respectively.
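For context on the reported scores, a minimal sketch of the BLEU-1 metric (modified unigram precision with a brevity penalty) is shown below. This is an illustrative stdlib-only implementation, not the paper's evaluation code, and the example sentences are invented; production evaluation would typically use a library implementation with smoothing.

```python
from collections import Counter
import math

def bleu1(candidate, references):
    """BLEU-1: clipped unigram precision times a brevity penalty.

    candidate: hypothesis sentence (str); references: list of reference strs.
    """
    cand = candidate.split()
    refs = [r.split() for r in references]
    # Clip each candidate word's count by its maximum count in any reference.
    max_ref = Counter()
    for r in refs:
        for w, c in Counter(r).items():
            max_ref[w] = max(max_ref[w], c)
    clipped = sum(min(c, max_ref[w]) for w, c in Counter(cand).items())
    precision = clipped / len(cand)
    # Brevity penalty against the reference closest in length.
    ref_len = min((abs(len(r) - len(cand)), len(r)) for r in refs)[1]
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * precision

# Hypothetical example: every candidate word appears in the reference,
# so precision is 1.0 and only the brevity penalty reduces the score.
print(bleu1("a man rides a bike", ["a man rides a red bike"]))
```

Higher-order BLEU-n scores extend the same idea to n-gram precisions combined by a geometric mean, which is why the reported values fall from 0.628 (BLEU-1) to 0.174 (BLEU-4).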