Hawkes processes have been shown to be effective in modeling bursty sequences in a variety of applications, such as finance and social network activity analysis. Traditionally, these models parameterize each process independently and assume that the history of each point process can be fully observed. Such models can, however, be ineffective or even inapplicable in certain real-world applications, such as education, where these assumptions are violated. Motivated by the problem of detecting and predicting student procrastination in Massive Open Online Courses (MOOCs) with missing and partially observed data, we propose a novel personalized Hawkes process model (RCHawkes-Gamma) that discovers meaningful student behavior clusters by jointly learning all partially observed processes, without relying on auxiliary features. Our experiments on both synthetic and real-world education datasets show that RCHawkes-Gamma can effectively recover student clusters and their temporal procrastination dynamics, resulting in better predictive performance on future student activities. Our further analyses of the learned parameters and their association with student delays show that the discovered student clusters provide meaningful representations of various procrastination behaviors in students.