This paper explores the suitability of using automatically discovered topics from MOOC discussion forums for modelling students' academic abilities. The Rasch model from psychometrics is a popular generative probabilistic model that relates latent student skill, latent item difficulty, and observed student-item responses within a principled, unified framework. According to scholarly educational theory, discovered topics can be regarded as appropriate measurement items if (1) students' participation across the discovered topics is well fit by the Rasch model, and (2) the topics are interpretable to subject-matter experts as being educationally meaningful. Such Rasch-scaled topics, with associated difficulty levels, could benefit curriculum refinement, student assessment and personalised feedback. The remaining technical challenge is to discover meaningful topics that simultaneously achieve good statistical fit with the Rasch model. To address this challenge, we combine the Rasch model with topic modelling based on non-negative matrix factorisation, fitting both models jointly. We demonstrate the suitability of our approach with quantitative experiments on data from three Coursera MOOCs, and with qualitative survey results on topic interpretability for a Discrete Optimisation MOOC.
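To illustrate the Rasch model referred to in the abstract, the following is a minimal Python sketch of the standard dichotomous Rasch formulation, in which the probability of a positive response is the logistic function of student skill minus item difficulty. All function names, variable names and toy values are hypothetical. Treating a response of 1 as "the student participated in the topic" is an assumption made for illustration. The sketch covers neither the topic model nor the joint fitting procedure.

```python
import math


def rasch_probability(skill, difficulty):
    """Probability of a positive response under the standard Rasch model.

    P(X = 1) = 1 / (1 + exp(-(skill - difficulty)))
    """
    return 1.0 / (1.0 + math.exp(-(skill - difficulty)))


def rasch_log_likelihood(responses, skills, difficulties):
    """Bernoulli log-likelihood of a binary student-by-item response matrix.

    responses[i][j] is 1 if student i responded positively to item j, else 0.
    """
    total = 0.0
    for i, row in enumerate(responses):
        for j, x in enumerate(row):
            p = rasch_probability(skills[i], difficulties[j])
            total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total


# Hypothetical toy data: 3 students, 2 items (topics).
skills = [-1.0, 0.0, 1.5]          # latent student skill (assumed values)
difficulties = [-0.5, 1.0]         # latent item difficulty (assumed values)
responses = [[0, 0], [1, 0], [1, 1]]

log_lik = rasch_log_likelihood(responses, skills, difficulties)
```

When skill equals difficulty the response probability is exactly 0.5, and it rises as skill exceeds difficulty. This is the property that allows items to be ordered on a common difficulty scale.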