Program functionality can generally be assessed with straightforward unit tests. Assessing the design quality of a program, however, is a much more difficult and nuanced problem. Design quality is an important consideration because it affects the readability and maintainability of programs. Assessing design quality and giving personalized feedback is a very time-consuming task for instructors and teaching assistants, which restricts personalized feedback to small class settings. Further, design quality is nuanced and difficult to express concisely as a set of rules. For these reasons, we propose a neural network model that both automatically assesses the design of a program and provides personalized feedback to guide students on how to make corrections. The model's effectiveness is evaluated on a corpus of student programs written in Python. When predicting design scores, the model achieves an accuracy of 83.67% to 94.27%, depending on the dataset, as compared to historical instructor assessments. Finally, we present a study in which students tried to improve the design of their programs based on the personalized feedback produced by the model. Students who participated in the study improved their program design scores by 19.58%.