Choosing an appropriate and effective way to assess students is one of the most important tasks in higher education. Many studies have shown that students tend to receive higher scores when they are assessed by a mix of methods, including units that are fully assessed by varying the duration of study or by a combination of courses and exams, than when they are assessed by exams alone. Many Educational Data Mining (EDM) studies preprocess their data using traditional data extraction procedures, including the data preparation process. In this paper, we propose a different data preparation process for student scores, based on an investigation of more than 230,000 student records. The data were processed in several stages to extract a categorical factor through which students' module marks are refined during data preparation. The results show that students' final marks should not be isolated from the nature of the assessment methods of the modules in which they are enrolled; rather, these methods must be investigated thoroughly and considered during the EDM data preprocessing stage. More generally, educational data should not be prepared in the same way as other data, because of differences in data sources, applications, and error types. The effect of the Module Assessment Index (MAI) on prediction was investigated using Random Forest and Naive Bayes classification techniques. Considering MAI as an attribute was shown to increase the accuracy of predicting students' second-year averages from their first-year averages.