Student engagement is a key construct for learning and teaching. While most of the literature has explored student engagement analysis in computer-based settings, this paper extends that focus to classroom instruction. To examine student visual engagement in the classroom, we conducted a study using audiovisual recordings of classes at a secondary school over one and a half months, acquired continuous engagement labels per student (N=15) in repeated sessions, and explored computer vision methods to classify engagement levels from faces in the classroom. We trained deep embeddings for attentional and emotional features: Attention-Net for head pose estimation and Affect-Net for facial expression recognition. We additionally trained different engagement classifiers, namely Support Vector Machines, Random Forest, Multilayer Perceptron, and Long Short-Term Memory, for both feature types. The best performing engagement classifiers achieved AUCs of .620 and .720 in Grades 8 and 12, respectively. We further investigated fusion strategies and found that score-level fusion either improves the engagement classifiers or is on par with the best performing modality. We also investigated the effect of personalization and found that using only 60 seconds of person-specific data, selected by margin uncertainty of the base classifier, yielded an average AUC improvement of .084. Our main aim with this work is to provide the technical means to facilitate the manual data analysis of classroom videos in research on teaching quality and in the context of teacher training.
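To make the two technical ideas in the abstract concrete, the following is a minimal sketch of score-level fusion and margin-based sample selection. It is illustrative only and is not the authors' implementation. It assumes a binary engagement classifier per modality that outputs a probability of "engaged", fuses the two modalities with a weighted average (one common form of score-level fusion), and measures margin uncertainty as the gap between the two class probabilities. The function names `fuse_scores` and `select_by_margin` and the weight `w` are hypothetical.

```python
def fuse_scores(p_attention, p_affect, w=0.5):
    """Score-level fusion (assumed form): weighted average of the
    per-sample 'engaged' probabilities from the two modalities."""
    if len(p_attention) != len(p_affect):
        raise ValueError("both modalities must score the same samples")
    return [w * a + (1.0 - w) * b for a, b in zip(p_attention, p_affect)]


def select_by_margin(probs, k):
    """Return the indices of the k samples the base classifier is least
    sure about. For a binary problem the margin is |p - (1 - p)|, so a
    small margin means the two classes are nearly tied."""
    margins = [abs(2.0 * p - 1.0) for p in probs]
    order = sorted(range(len(probs)), key=lambda i: margins[i])
    return order[:k]


if __name__ == "__main__":
    # Hypothetical per-sample probabilities from each modality.
    attention = [0.90, 0.55, 0.20, 0.48]
    affect = [0.70, 0.45, 0.40, 0.60]
    fused = fuse_scores(attention, affect)
    # Pick the two most uncertain samples as candidates for
    # person-specific labeling.
    print(select_by_margin(fused, k=2))
```

In a personalization setting, the selected samples would be labeled and used to adapt the base classifier to one student; how that adaptation is done is outside the scope of this sketch.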