Expressing and identifying emotions through facial and physical expressions is a significant part of social interaction. Emotion recognition is an essential task in computer vision due to its various applications, chiefly enabling more natural interaction between humans and machines. Common approaches to emotion recognition focus on analyzing facial expressions and require automatic localization of the face in the image. Although these methods can correctly classify emotions in controlled scenarios, such techniques are limited when dealing with unconstrained daily interactions. We propose a new deep learning approach for emotion recognition based on adaptive multi-cues, extracting information from context and body poses, which humans commonly use in social interaction and communication. We compare the proposed approach with state-of-the-art approaches on the CAER-S dataset, evaluating different components in a pipeline that reached an accuracy of 89.30%.
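The adaptive multi-cue idea can be illustrated with a minimal sketch: feature vectors from separate cue branches (face, body pose, context) are fused with adaptive weights before classification. This is a toy illustration with placeholder dimensions, random features, and a simple softmax-weighted sum; it is not the paper's actual architecture, and the scoring rule here is a hypothetical stand-in for a learned attention module.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical feature vectors produced by three cue branches
# (face, body pose, scene context); 128 dims is an arbitrary choice.
face = rng.standard_normal(128)
body = rng.standard_normal(128)
context = rng.standard_normal(128)

# Adaptive fusion: score each cue, then normalize the scores into
# attention weights. A real model would learn this scoring function.
scores = np.array([face.mean(), body.mean(), context.mean()])
weights = softmax(scores)

# Weighted sum fuses the cues into a single representation.
fused = weights[0] * face + weights[1] * body + weights[2] * context

# Toy linear classifier over 7 emotion classes (CAER-S uses 7 categories).
W = rng.standard_normal((7, 128))
logits = W @ fused
pred = int(np.argmax(logits))
print(pred)
```

The key property is that the fusion weights depend on the input itself, so the model can lean on context or body pose when the face is occluded or poorly localized.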