Fatigue detection helps people maintain mental health and prevents safety accidents. However, detecting facial fatigue with machine vision, especially mild fatigue in real-world settings, remains challenging because non-laboratory datasets and well-defined algorithms are lacking. To improve facial fatigue detection so that it can be used widely in daily life, this paper presents an audiovisual dataset named DLFD (daily-life fatigue dataset), which reflects people's facial fatigue states in the wild. A framework combining 3D-ResNet with a non-local attention mechanism was trained to extract local and long-range features in the spatial and temporal dimensions. A compact loss function combining mean squared error and cross-entropy was then designed to predict both continuous and categorical fatigue degrees. The proposed framework reached an average accuracy of 90.8% on the validation set and 72.5% on the test set for binary classification, which compares favorably with other state-of-the-art methods. Feature map visualization revealed that the framework captured facial dynamics and attempted to relate them to the fatigue state. Experimental results across multiple metrics show that the framework captured some typical, micro, and dynamic facial features along the spatiotemporal dimensions, contributing to mild fatigue detection in the wild.
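As a rough illustration of a loss that combines mean squared error and cross-entropy, the sketch below takes a weighted sum of the two terms. The abstract does not say how the terms are combined, so the weighted-sum form, the weight `alpha`, and the names `softmax` and `combined_loss` are assumptions made for illustration rather than the authors' formulation.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(pred_scores, true_scores, logits, labels, alpha=0.5):
    """Weighted sum of a regression loss and a classification loss.

    pred_scores / true_scores: continuous fatigue degrees, one per sample.
    logits: per-sample lists of raw class scores.
    labels: per-sample integer class indices.
    alpha: hypothetical weight on the MSE term (an assumption, not from the paper).
    """
    n = len(labels)
    # Mean squared error on the continuous fatigue degree.
    mse = sum((p - t) ** 2 for p, t in zip(pred_scores, true_scores)) / n
    # Cross-entropy on the categorical fatigue degree.
    ce = -sum(math.log(softmax(z)[y]) for z, y in zip(logits, labels)) / n
    return alpha * mse + (1.0 - alpha) * ce


if __name__ == "__main__":
    loss = combined_loss(
        pred_scores=[0.2, 0.9],
        true_scores=[0.0, 1.0],
        logits=[[2.0, -1.0], [-0.5, 1.5]],
        labels=[0, 1],
    )
    print(f"combined loss: {loss:.4f}")
```

Setting `alpha` to 1.0 reduces the sketch to pure regression and 0.0 to pure classification, which makes the relative contribution of each term easy to inspect.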