A Predictive Model using Machine Learning Algorithm in Identifying Students Probability on Passing Semestral Course

This study aims to develop a predictive model that estimates, at the earliest stage of the semester, the probability that students will pass the courses they are taking. To obtain a model with acceptable accuracy and precision that supports decision making in education systems, improves the delivery of knowledge, and raises students' academic performance, the proponent strictly followed the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology. The study uses classification as the data mining technique and a decision tree as the algorithm. The resulting model predicts whether students will pass their current courses with 0.7619 accuracy, 0.8333 precision, 0.8823 recall, and an F1 score of 0.8571, which the study takes as evidence that the model is reliable, accurate, and recommendable. Given these indicators and results, the prediction model is considered highly acceptable. Data mining techniques provide effective and efficient tools for analyzing and predicting student performance. The model can change how educators understand and identify their students' weaknesses in class, how they improve the effectiveness of the learning process for their students, and how they bring down academic failure rates, and it can help institution administrators modify their learning system outcomes. Further study is needed on including students' demographic information, using a much larger dataset, and providing automated and manual handling of the predictive criteria so that students can see, as early as the midterm period, which criteria they must improve in order to pass their courses at the end of the semester.
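The four reported metrics are related by standard definitions, and a short sketch can make that relationship concrete. The Python below is illustrative only and is not the study's code. The function names are hypothetical, and the confusion-matrix counts (15 true positives, 3 false positives, 2 false negatives, 1 true negative) are an assumption: the abstract does not report a confusion matrix, and these counts were chosen only because they give values close to the reported figures.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) for binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted passes that were real passes
    recall = tp / (tp + fn)      # share of real passes that were predicted
    return accuracy, precision, recall, f1_from(precision, recall)


if __name__ == "__main__":
    # Consistency check using only the precision and recall from the abstract.
    print("F1 from reported precision/recall:", round(f1_from(0.8333, 0.8823), 4))

    # ASSUMPTION: hypothetical counts, not taken from the study.
    acc, prec, rec, f1 = classification_metrics(tp=15, fp=3, fn=2, tn=1)
    print("accuracy :", round(acc, 4))
    print("precision:", round(prec, 4))
    print("recall   :", round(rec, 4))
    print("f1       :", round(f1, 4))
```

With counts like these, only one of the four actual non-passing students would be identified correctly, which illustrates why accuracy, precision, and recall are best read together with the class balance of the test set.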
