When students write programs, their program structure provides insight into their learning process. However, analyzing program structure by hand is time-consuming, and teachers need better tools for computer-assisted exploration of student solutions. As a first step towards an education-oriented program analysis toolkit, we show how supervised machine learning methods can automatically classify student programs into a predetermined set of high-level structures. We evaluate two models on classifying student solutions to the Rainfall problem: a nearest-neighbors classifier using syntax tree edit distance and a recurrent neural network. We demonstrate that these models can achieve 91% classification accuracy when trained on 108 programs. We further explore the generality, trade-offs, and failure cases of each model.
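To make the nearest-neighbors idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes Python source and the standard `ast` module, whereas the abstract does not say which language the student programs are written in. It also swaps in a simple top-down (Selkow-style) tree distance, in which only whole subtrees are inserted or deleted, rather than a general tree edit distance. The structure labels `single_loop` and `clean_first`, the two training programs, and all function names are hypothetical and chosen only for illustration.

```python
import ast
from collections import Counter
from functools import lru_cache


def to_tree(node):
    """Convert an ast node into a hashable (label, children) tuple.

    Only node type names are kept, so identifiers and constants are ignored.
    """
    children = tuple(to_tree(child) for child in ast.iter_child_nodes(node))
    return (type(node).__name__, children)


def size(tree):
    """Number of nodes in a (label, children) tree."""
    return 1 + sum(size(child) for child in tree[1])


@lru_cache(maxsize=None)
def tree_distance(a, b):
    """Top-down tree distance (a simplification, assumed for this sketch).

    Relabeling a node costs 1; inserting or deleting a whole subtree
    costs its node count. Children are aligned with a sequence edit distance.
    """
    relabel = 0 if a[0] == b[0] else 1
    xs, ys = a[1], b[1]
    # prev[j] = cost of aligning the children seen so far in xs with ys[:j]
    prev = [0]
    for y in ys:
        prev.append(prev[-1] + size(y))
    for x in xs:
        cur = [prev[0] + size(x)]
        for j, y in enumerate(ys, start=1):
            cur.append(min(
                prev[j] + size(x),               # delete subtree x
                cur[j - 1] + size(y),            # insert subtree y
                prev[j - 1] + tree_distance(x, y),  # match x with y
            ))
        prev = cur
    return relabel + prev[-1]


def classify(source, labeled_programs, k=1):
    """Label `source` by majority vote among its k nearest labeled programs."""
    query = to_tree(ast.parse(source))
    ranked = sorted(
        labeled_programs,
        key=lambda item: tree_distance(query, to_tree(ast.parse(item[0]))),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: one program per illustrative structure label.
SINGLE_LOOP = """
def rainfall(xs):
    total = 0
    count = 0
    for x in xs:
        if x == -999:
            break
        if x >= 0:
            total += x
            count += 1
    return total / count if count else 0
"""

CLEAN_FIRST = """
def rainfall(xs):
    clean = []
    for x in xs:
        if x == -999:
            break
        if x >= 0:
            clean.append(x)
    return sum(clean) / len(clean) if clean else 0
"""

TRAIN = [(SINGLE_LOOP, "single_loop"), (CLEAN_FIRST, "clean_first")]

QUERY = """
def average_rain(readings):
    s = 0
    n = 0
    for r in readings:
        if r == -999:
            break
        if r >= 0:
            s += r
            n += 1
    return s / n if n else 0
"""

predicted = classify(QUERY, TRAIN, k=1)
```

Because `to_tree` discards identifiers, `QUERY` has the same node-type structure as the first training program and sits at distance zero from it, so the nearest neighbor determines its label. A realistic setup would need many labeled programs per structure and a choice of `k` validated on held-out data.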