Light field (LF) cameras provide rich spatio-angular visual representations by sensing the visual scene from multiple perspectives, and have recently emerged as a promising technology to boost the performance of human-machine systems such as biometrics and affective computing. Despite the significant success of LF representations for constrained facial image analysis, this technology has never been used for face and expression recognition in the wild. In this context, this paper proposes a new deep face and expression recognition solution, called CapsField, based on a convolutional neural network and an additional capsule network that uses dynamic routing to learn hierarchical relations between capsules. CapsField extracts spatial features from facial images and learns the angular part-whole relations for a selected set of 2D sub-aperture images rendered from each LF image. To analyze the performance of the proposed solution in the wild, the first in-the-wild LF face dataset, along with a new complementary constrained face dataset captured from the same subjects recorded earlier, have been collected and made available. A subset of the in-the-wild dataset contains facial images with different expressions, annotated for use in facial expression recognition tests. An extensive performance assessment using the new datasets has been conducted for the proposed solution and relevant prior solutions, showing that the proposed CapsField solution achieves superior performance for both face and expression recognition tasks when compared to the state of the art.
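To make the dynamic routing step more concrete, the following is a minimal NumPy sketch of generic routing-by-agreement between two capsule layers. It is not the CapsField implementation: the abstract does not specify the architecture, so the function names (`squash`, `dynamic_routing`), the tensor sizes, and the choice to treat each sub-aperture view as one input capsule are all assumptions made for illustration.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Scale each vector so its norm lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement over prediction vectors.

    u_hat: array of shape (num_in, num_out, dim_out), the prediction of
           each input capsule for each output capsule.
    Returns the output capsules (num_out, dim_out) and the coupling
    coefficients (num_in, num_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))              # routing logits
    for _ in range(num_iters):
        c = softmax(b, axis=1)                   # each input distributes over outputs
        s = np.einsum("ij,ijd->jd", c, u_hat)    # weighted sum per output capsule
        v = squash(s)                            # output capsule vectors
        b = b + np.einsum("ijd,jd->ij", u_hat, v)  # reward agreeing predictions
    return v, c

# Hypothetical sizes: 9 sub-aperture views, 4 output capsules, 16-D capsule vectors.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(9, 4, 16))
v, c = dynamic_routing(u_hat)
```

In this sketch, the coupling coefficients `c` indicate how strongly each input capsule contributes to each output capsule, which is one way the part-whole relations mentioned above could be represented.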