Humans usually convey emotions, voluntarily or involuntarily, through facial expressions. Automatically recognizing basic expressions (such as happiness, sadness, and neutral) from a facial image, i.e., facial expression recognition (FER), is extremely challenging and has attracted much research interest. Large-scale datasets and powerful inference models have been proposed to address the problem. Although considerable progress has been made, most state-of-the-art methods, which employ convolutional neural networks (CNNs) or elaborately modified Vision Transformers (ViTs), depend heavily on upstream supervised pretraining. Transformers are replacing CNNs as the dominant architecture in more and more computer vision tasks, but they usually require much more training data, since they encode fewer inductive biases than CNNs. To explore whether a vanilla ViT without extra training samples from upstream tasks can achieve competitive accuracy, we use a plain ViT with MAE pretraining to perform the FER task. Specifically, we first pretrain the original ViT as a Masked Autoencoder (MAE) on a large facial expression dataset without expression labels. Then, we fine-tune the ViT on popular facial expression datasets with expression labels. The presented method achieves competitive results, 90.22\% on RAF-DB and 61.73\% on AffectNet, and can serve as a simple yet strong ViT-based baseline for FER studies.
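Below is a minimal PyTorch sketch of the two-stage recipe summarized above: MAE-style self-supervised pretraining of a plain ViT encoder on unlabeled face images, followed by supervised fine-tuning with a linear classification head on expression labels. The architecture sizes, mask ratio, mean-pooling head, and lightweight decoder are illustrative assumptions, not the paper's exact configuration.

```python
# Stage 1: MAE-style pretraining of a plain ViT encoder (no expression labels).
# Stage 2: supervised fine-tuning of the same encoder with a linear head.
# All hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn


def patchify(imgs, patch=16):
    """Split images (B, 3, H, W) into flattened patches (B, N, patch*patch*3)."""
    B, C, H, W = imgs.shape
    x = imgs.reshape(B, C, H // patch, patch, W // patch, patch)
    return x.permute(0, 2, 4, 3, 5, 1).reshape(B, (H // patch) * (W // patch), -1)


class PatchEmbed(nn.Module):
    def __init__(self, img_size=224, patch=16, dim=768):
        super().__init__()
        self.num_patches = (img_size // patch) ** 2
        self.proj = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)

    def forward(self, x):  # (B, 3, H, W) -> (B, N, dim)
        return self.proj(x).flatten(2).transpose(1, 2)


class MAEPretrainer(nn.Module):
    """Stage 1: mask most patches, encode only the visible ones with a plain
    ViT encoder, and reconstruct the pixels of the masked patches."""

    def __init__(self, dim=768, depth=12, heads=12, dec_dim=512, dec_depth=4,
                 mask_ratio=0.75, patch=16):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.patch_embed = PatchEmbed(patch=patch, dim=dim)
        n = self.patch_embed.num_patches
        self.pos = nn.Parameter(torch.zeros(1, n, dim))
        enc_layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, depth)
        # Lightweight decoder, used only during pretraining and discarded later.
        self.enc_to_dec = nn.Linear(dim, dec_dim)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dec_dim))
        self.dec_pos = nn.Parameter(torch.zeros(1, n, dec_dim))
        dec_layer = nn.TransformerEncoderLayer(dec_dim, 8, dec_dim * 4, batch_first=True)
        self.decoder = nn.TransformerEncoder(dec_layer, dec_depth)
        self.recon_head = nn.Linear(dec_dim, patch * patch * 3)

    def forward(self, imgs):
        tokens = self.patch_embed(imgs) + self.pos
        B, N, D = tokens.shape
        keep = int(N * (1 - self.mask_ratio))
        idx = torch.rand(B, N, device=imgs.device).argsort(dim=1)  # random patch order
        vis_idx, mask_idx = idx[:, :keep], idx[:, keep:]
        visible = torch.gather(tokens, 1, vis_idx.unsqueeze(-1).expand(-1, -1, D))
        latent = self.encoder(visible)               # encode visible patches only
        # Decoder sees encoded visible tokens plus mask tokens, with position info.
        dec_dim = self.mask_token.shape[-1]
        dec_in = torch.cat([self.enc_to_dec(latent),
                            self.mask_token.expand(B, N - keep, -1)], dim=1)
        dec_pos = torch.gather(self.dec_pos.expand(B, -1, -1), 1,
                               idx.unsqueeze(-1).expand(-1, -1, dec_dim))
        pred = self.recon_head(self.decoder(dec_in + dec_pos))[:, keep:]
        target = torch.gather(patchify(imgs), 1,
                              mask_idx.unsqueeze(-1).expand(-1, -1, pred.shape[-1]))
        return nn.functional.mse_loss(pred, target)  # loss on masked patches only


class FERClassifier(nn.Module):
    """Stage 2: reuse the pretrained encoder; classify mean-pooled tokens."""

    def __init__(self, pretrained: MAEPretrainer, num_classes=7):
        super().__init__()
        self.patch_embed = pretrained.patch_embed
        self.pos = pretrained.pos
        self.encoder = pretrained.encoder
        self.head = nn.Linear(self.pos.shape[-1], num_classes)

    def forward(self, imgs):
        tokens = self.encoder(self.patch_embed(imgs) + self.pos)
        return self.head(tokens.mean(dim=1))
```

In the actual pipeline, the pretraining loss would be minimized over the unlabeled facial images, after which the encoder weights are carried into the classifier and fine-tuned with cross-entropy on RAF-DB or AffectNet; the mean-pooled classification head here is one common simplification of a [CLS]-token head.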