We present a new non-invasive, remotely accessible method for (hidden) emotion detection from videos of human faces that scales easily to large numbers of subjects. Our approach combines face manifold detection, for accurate localization of the face in the video, with local face manifold embedding, which creates a common domain for measuring muscle micro-movements that is invariant to the subject's motion in the video. Next, we employ Digital Image Speckle Correlation (DISC) and an optical flow algorithm to compute the pattern of micro-movements in the face. The resulting vector field is mapped back to the original space and superimposed on the original frames of the video, so the output videos carry additional information about the direction of muscle movement in the face. We take the publicly available CK++ dataset of visible emotions and extend it with videos of the same format showing hidden emotions. We process all videos with our micro-movement detection and use the results to train a state-of-the-art network for emotion classification from videos, the Frame Attention Network (FAN). Although the original FAN model achieves very high out-of-sample performance on the original CK++ videos, it performs considerably worse on hidden-emotion videos. Performance improves significantly when the model is trained and tested on videos augmented with the vector fields of muscle movements. Intuitively, the corresponding arrows act as edges in the image that are easily captured by the convolutional filters of the FAN network.
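The micro-movement field above is computed with DISC and an optical flow algorithm. As a minimal illustration of the optical-flow idea only (not the authors' actual pipeline, whose DISC details are not given here), the sketch below estimates a single translation vector between two grayscale frames by least-squares on the brightness-constancy equation, in the style of a one-window Lucas-Kanade step; the function name and the synthetic Gaussian-blob test data are illustrative assumptions.

```python
import numpy as np

def lucas_kanade_window(frame1, frame2):
    """Toy single-window optical flow: estimate one (u, v) translation
    between two grayscale frames by least-squares on the
    brightness-constancy constraint Ix*u + Iy*v + It = 0.
    (Illustrative stand-in for the dense flow used in the paper.)"""
    f1 = frame1.astype(float)
    f2 = frame2.astype(float)
    # Spatial gradients of the first frame (np.gradient returns
    # the derivative along axis 0 (rows, y) first, then axis 1 (x)).
    Iy, Ix = np.gradient(f1)
    # Temporal gradient between the two frames.
    It = f2 - f1
    # Stack the per-pixel constraints into an overdetermined system.
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic check: a smooth Gaussian blob shifted one pixel to the right
# should yield a flow vector close to (1, 0).
y, x = np.mgrid[0:64, 0:64]
blob = lambda cx, cy: np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * 6.0 ** 2))
u, v = lucas_kanade_window(blob(30, 32), blob(31, 32))
```

In the full method such per-region vectors form a dense field over the face, which is then mapped back to the original coordinates and drawn as arrows on the video frames.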