College Student Retention Risk Analysis From Educational Database using Multi-Task Multi-Modal Neural Fusion

We develop a Multimodal Spatiotemporal Neural Fusion network for Multi-Task Learning (MSNF-MTCL) to predict five important student retention risks: future dropout, next-semester dropout, type of dropout, duration of dropout, and cause of dropout. First, we develop a general-purpose multi-modal neural fusion network model, MSNF, that learns a representation of students' academic information by fusing spatial and temporal unstructured advising notes with spatiotemporal structured data. MSNF combines four components: a Bidirectional Encoder Representations from Transformers (BERT)-based document embedding framework that represents each advising note, a Long Short-Term Memory (LSTM) network that models the temporal sequence of advising note embeddings, a second LSTM network that models students' temporal performance variables, and students' static demographics. The final fused representation from MSNF is then fed to a Multi-Task Cascade Learning (MTCL) model, yielding MSNF-MTCL, which predicts the five student retention risks. We evaluate MSNF-MTCL on a large educational database consisting of 36,445 college students over an 18-year period, where it shows promising performance compared with the nearest state-of-the-art models. Additionally, we test the fairness of the model given the existence of biases.
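
The fusion and cascade described above can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation: the class name `FusionCascadeSketch`, the hidden size, the feature dimensions, the per-task class counts, and the specific cascade wiring (each head receiving the fused vector plus the logits of all earlier heads) are assumptions made for illustration. BERT note embeddings are stood in for by random 768-dimensional tensors, and fusion is assumed to be plain concatenation.

```python
import torch
import torch.nn as nn


class FusionCascadeSketch(nn.Module):
    """Hypothetical sketch: two LSTM encoders plus static features,
    fused by concatenation, followed by cascaded task heads."""

    def __init__(self, note_dim=768, perf_dim=8, static_dim=6,
                 hidden=32, task_classes=(2, 2, 3, 4, 5)):
        super().__init__()
        # Encoder for the temporal sequence of (precomputed) note embeddings.
        self.note_lstm = nn.LSTM(note_dim, hidden, batch_first=True)
        # Encoder for the temporal sequence of performance variables.
        self.perf_lstm = nn.LSTM(perf_dim, hidden, batch_first=True)
        fused_dim = 2 * hidden + static_dim
        # One linear head per task; head i also sees logits of heads 0..i-1
        # (assumed cascade wiring, class counts are placeholders).
        heads, extra = [], 0
        for n_classes in task_classes:
            heads.append(nn.Linear(fused_dim + extra, n_classes))
            extra += n_classes
        self.heads = nn.ModuleList(heads)

    def forward(self, note_emb, perf_seq, static):
        # Use the final hidden state of each LSTM as the sequence summary.
        _, (h_note, _) = self.note_lstm(note_emb)
        _, (h_perf, _) = self.perf_lstm(perf_seq)
        fused = torch.cat([h_note[-1], h_perf[-1], static], dim=1)
        outputs, cascade_in = [], fused
        for head in self.heads:
            logits = head(cascade_in)
            outputs.append(logits)
            # Pass this task's logits on to the next task's head.
            cascade_in = torch.cat([cascade_in, logits], dim=1)
        return outputs


torch.manual_seed(0)
model = FusionCascadeSketch()
notes = torch.randn(4, 10, 768)  # 4 students, 10 notes each, 768-d embeddings
perf = torch.randn(4, 12, 8)     # 4 students, 12 terms, 8 performance variables
static = torch.randn(4, 6)       # 4 students, 6 static demographic features
logits = model(notes, perf, static)
shapes = [tuple(t.shape) for t in logits]
```

Each entry of `logits` holds one task's unnormalized scores for the batch. In training, a per-task loss (for example cross-entropy) would typically be summed across the five heads, though the weighting of tasks is not specified in the abstract.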
