Traditional learning-based approaches to student modeling generalize poorly to underrepresented student groups due to biases in data availability. In this paper, we propose a methodology for predicting student performance from online learning activities that optimizes inference accuracy across demographic groups defined by attributes such as race and gender. Building on recent foundations in federated learning, our approach derives personalized models for individual student subgroups from a global model, which is aggregated across all student models via meta-gradient updates that account for subgroup heterogeneity. To learn better representations of student activity, we augment our approach with a self-supervised behavioral pretraining methodology that leverages multiple modalities of student behavior (e.g., visits to lecture videos and participation on forums), and we include a neural network attention mechanism in the model aggregation stage. Through experiments on three real-world datasets from online courses, we demonstrate that our approach obtains substantial improvements over existing student modeling baselines in predicting student learning outcomes for all subgroups. Visual analysis of the resulting student embeddings confirms that our personalization methodology indeed identifies different activity patterns within different subgroups, consistent with its stronger inference ability compared with the baselines.
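To make the idea of deriving subgroup-personalized models from a shared global model more concrete, the following is a minimal sketch of one common way to realize meta-gradient aggregation: a first-order, MAML-style federated update. It is an illustration under stated assumptions, not the method evaluated in this paper. The linear predictor, the synthetic data, the uniform averaging (used here in place of the attention-based aggregation), and all function names (`mse_and_grad`, `personalize`, `meta_round`, `make_group`, `adapted_query_loss`) are hypothetical, and the self-supervised pretraining stage is omitted entirely.

```python
import numpy as np

def mse_and_grad(w, X, y):
    """Mean squared error of a linear predictor and its gradient w.r.t. w."""
    resid = X @ w - y
    return float(np.mean(resid ** 2)), 2.0 * X.T @ resid / len(y)

def personalize(w_global, X, y, lr=0.05, steps=1):
    """Adapt the global model to one subgroup with a few gradient steps."""
    w = w_global.copy()
    for _ in range(steps):
        _, grad = mse_and_grad(w, X, y)
        w = w - lr * grad
    return w

def meta_round(w_global, groups, inner_lr=0.05, outer_lr=0.05):
    """One aggregation round with first-order meta-gradients.

    Each subgroup adapts the global model on its support split, then
    reports the gradient of its query loss at the adapted weights.
    The server averages these gradients uniformly (assumption: the
    paper's attention mechanism would replace this plain mean).
    """
    meta_grads = []
    for Xs, ys, Xq, yq in groups:
        w_local = personalize(w_global, Xs, ys, lr=inner_lr)
        _, grad = mse_and_grad(w_local, Xq, yq)
        meta_grads.append(grad)
    return w_global - outer_lr * np.mean(meta_grads, axis=0)

def make_group(rng, w_true, n):
    """Synthetic subgroup: n support and n query samples (hypothetical data)."""
    X = rng.normal(size=(2 * n, len(w_true)))
    y = X @ w_true + 0.1 * rng.normal(size=2 * n)
    return X[:n], y[:n], X[n:], y[n:]

def adapted_query_loss(w_global, groups):
    """Average query loss after each subgroup personalizes the global model."""
    losses = []
    for Xs, ys, Xq, yq in groups:
        w_local = personalize(w_global, Xs, ys)
        losses.append(mse_and_grad(w_local, Xq, yq)[0])
    return float(np.mean(losses))

rng = np.random.default_rng(0)
groups = [
    make_group(rng, np.array([1.0, -2.0, 0.5]), n=200),  # well-represented subgroup
    make_group(rng, np.array([1.5, -1.0, 0.0]), n=20),   # underrepresented subgroup
]

w_init = np.zeros(3)
w_global = w_init.copy()
for _ in range(200):
    w_global = meta_round(w_global, groups)

print("adapted query loss, untrained init:", adapted_query_loss(w_init, groups))
print("adapted query loss, meta-trained:  ", adapted_query_loss(w_global, groups))
```

In this sketch the global weights are trained so that a single local gradient step yields a good subgroup model, rather than so that the global model itself fits the pooled data. Averaging per-subgroup meta-gradients, instead of per-sample gradients, also keeps the smaller subgroup from being outweighed by the larger one.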