This work presents a new system for remote attention level estimation based on multimodal face analysis. Our multimodal approach uses parameters and signals obtained from behavioral and physiological processes that have been related to cognitive load modeling, such as face gestures (e.g., blink rate, facial action units) and user actions (e.g., head pose, distance to the camera). The system comprises the following modules based on Convolutional Neural Networks (CNNs): eye blink detection, head pose estimation, facial landmark detection, and facial expression features. First, we individually evaluate the proposed modules on the task of estimating the attention level of students during online e-learning sessions. For that purpose, we train a binary classifier (high vs. low attention) based on Support Vector Machines (SVM) for each module. Second, we analyze to what extent multimodal score-level fusion improves the attention level estimation. The experimental framework uses mEBAL, a public multimodal database for attention level estimation acquired in an e-learning environment, which contains data from 38 users performing several e-learning tasks of varying difficulty (inducing changes in the students' cognitive load).
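To make the described pipeline concrete, the sketch below shows one per-module SVM attention classifier per feature stream and a simple score-level fusion of their outputs. This is a minimal illustration, not the authors' implementation: the module names, feature dimensions, synthetic data, and the mean-fusion rule are assumptions for demonstration only.

```python
# Minimal sketch (assumed, not the paper's code): one SVM per module plus
# score-level fusion of the per-module "high attention" probabilities.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical per-module feature dimensions (blink, head pose, landmarks, expression).
modules = {"blink": 4, "head_pose": 3, "landmarks": 136, "expression": 17}

# Synthetic data standing in for mEBAL features; y = 1 is high attention, 0 is low.
n_train = 200
y_train = rng.integers(0, 2, n_train)
X_train = {m: rng.normal(size=(n_train, d)) for m, d in modules.items()}

# Train one SVM per module, with probability outputs so scores can be fused.
classifiers = {
    m: make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True)).fit(
        X_train[m], y_train
    )
    for m in modules
}

def fuse_scores(samples):
    """Score-level fusion: average the per-module probabilities of high attention."""
    scores = np.column_stack(
        [classifiers[m].predict_proba(samples[m])[:, 1] for m in modules]
    )
    return scores.mean(axis=1)  # simple mean fusion; weighted fusion is another option

# Example: fused attention scores for 5 unseen samples.
X_test = {m: rng.normal(size=(5, d)) for m, d in modules.items()}
print(fuse_scores(X_test))
```

In practice, each module's feature vector would come from its CNN (e.g., blink statistics or head pose angles over a time window), and the fusion weights could be tuned on a validation set instead of using a plain average.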