Educational process data, i.e., logs of detailed student activities in computerized or online learning platforms, has the potential to offer deep insights into how students learn. Process data can support many downstream tasks, such as predicting learning outcomes and automatically delivering personalized interventions. However, analyzing process data is challenging because its specific format varies considerably across learning and testing scenarios. In this paper, we propose a framework for learning representations of educational process data that is applicable across many different learning scenarios. Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data and a fine-tuning step that further adjusts these representations on downstream prediction tasks. We apply our framework to the dataset from the 2019 Nation's Report Card Data Mining Competition, which consists of student problem-solving process data, and detail the specific models we use in this scenario. We conduct both quantitative and qualitative experiments to show that our framework yields process data representations that are both predictive and informative.
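As a rough illustration of what a BERT-type objective can look like on sequential process data, the sketch below masks a random subset of events in an action log and keeps the original events as prediction targets. It is a minimal sketch under stated assumptions, not the models used in this work: the event names, the `[MASK]` token, the `mask_events` helper, and the masking rate are all hypothetical choices made for illustration.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, borrowed from BERT-style masking


def mask_events(events, mask_prob=0.15, seed=0):
    """Return (masked_events, labels) for a masked-event prediction objective.

    labels[i] holds the original event where position i was masked,
    and None where the event was left unchanged.
    """
    rng = random.Random(seed)  # local generator so the result is reproducible
    masked, labels = [], []
    for event in events:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(event)   # the model would be trained to recover this
        else:
            masked.append(event)
            labels.append(None)    # no loss is computed at this position
    return masked, labels


# Hypothetical action log for one student working on one problem.
log = ["enter_item", "open_calculator", "type_answer",
       "close_calculator", "exit_item"]
masked, labels = mask_events(log, mask_prob=0.3, seed=1)
```

During pre-training, a sequence model would receive `masked` as input and be trained to predict the non-`None` entries of `labels`; fine-tuning would then reuse the learned representations for a downstream prediction task.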