Facial Expression Recognition with Visual Transformers and Attentional Selective Fusion - 专知 (Zhuanzhi) Papers


February 22, 2022

Facial Expression Recognition with Visual Transformers and Attentional Selective Fusion


Fuyan Ma, Bin Sun, Shutao Li

Facial Expression Recognition (FER) in the wild is extremely challenging due to occlusions, variant head poses, face deformation and motion blur under unconstrained conditions. Although substantial progress has been made in automatic FER over the past few decades, previous studies were mainly designed for lab-controlled FER. Real-world occlusions, variant head poses and other issues make FER harder because they introduce information-deficient regions and complex backgrounds. Unlike previous purely CNN-based methods, we argue that it is feasible and practical to translate facial images into sequences of visual words and perform expression recognition from a global perspective. Therefore, we propose the Visual Transformers with Feature Fusion (VTFF) to tackle FER in the wild in two main steps. First, we propose the attentional selective fusion (ASF) for leveraging two kinds of feature maps generated by two-branch CNNs. The ASF captures discriminative information by fusing multiple features with global-local attention. The fused feature maps are then flattened and projected into sequences of visual words. Second, inspired by the success of Transformers in natural language processing, we propose to model relationships between these visual words with global self-attention. The proposed method is evaluated on three public in-the-wild facial expression datasets (RAF-DB, FERPlus and AffectNet). Under the same settings, extensive experiments demonstrate that our method outperforms other methods, setting a new state of the art on RAF-DB (88.14%), FERPlus (88.81%) and AffectNet (61.85%). The cross-dataset evaluation on CK+ shows the promising generalization capability of the proposed method.
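The two steps in the abstract (fuse two feature maps, then flatten them into visual words and apply global self-attention) can be illustrated with a toy sketch. This is not the authors' implementation. The gating formula in `selective_fusion`, the use of a channel average as "global" context, and the choice Q = K = V with no learned projections are all simplifying assumptions made for illustration, and every function name below is hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def selective_fusion(x, y):
    """Blend two C x H x W feature maps with a gate in (0, 1).

    Assumption: the gate combines a local term (the element-wise sum) and a
    global term (the channel average of that sum). A real model would use
    learned layers here.
    """
    fused = []
    for xc, yc in zip(x, y):
        s = [[a + b for a, b in zip(xr, yr)] for xr, yr in zip(xc, yc)]
        g = sum(map(sum, s)) / (len(s) * len(s[0]))  # global context
        fused.append([
            [sigmoid(sv + g) * a + (1.0 - sigmoid(sv + g)) * b
             for sv, a, b in zip(sr, xr, yr)]
            for sr, xr, yr in zip(s, xc, yc)
        ])
    return fused


def flatten_to_words(fmap):
    """Turn a C x H x W map into H*W visual words, each a C-dim vector."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][i][j] for c in range(channels)]
            for i in range(height) for j in range(width)]


def self_attention(words):
    """Single-head global self-attention with Q = K = V = words (assumption)."""
    d = len(words[0])
    out, weights = [], []
    for q in words:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in words]
        m = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        weights.append(w)
        out.append([sum(wi * v[t] for wi, v in zip(w, words))
                    for t in range(d)])
    return out, weights


# Two toy 2-channel, 2x2 feature maps standing in for the two CNN branches.
branch_a = [[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.6], [0.7, 0.8]]]
branch_b = [[[0.9, 0.8], [0.7, 0.6]], [[0.5, 0.4], [0.3, 0.2]]]

fused = selective_fusion(branch_a, branch_b)
words = flatten_to_words(fused)
attended, attn = self_attention(words)
print(len(words), len(words[0]))  # number of visual words, word dimension
```

Each of the four spatial positions becomes one visual word, and every word attends to all the others, which is the sense in which the recognition is done "from a global perspective". A practical version would replace the hand-written gate and attention with learned layers from a deep learning framework.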


