Initial assessment tests are crucial for capturing learner knowledge states in a consistent manner. Aside from crafting the questions themselves, assembling relevant problems into a question sheet is also a time-consuming process. In this work, we present a generic formulation of question assembly and a genetic-algorithm-based method that generates assessment tests from raw problem-solving history. First, we estimate the learner-question knowledge matrix (snapshot), in which each element is the probability that a learner correctly answers a specific question. We then formulate the task as a combinatorial search over this snapshot. To ensure representative and discriminative diagnostic tests, we select questions that (1) have a low root mean squared error against the whole question pool and (2) have a high standard deviation among learner performances. Experimental results show that the proposed method outperforms greedy and random baselines by a large margin on one private dataset and four public datasets. We also performed a qualitative analysis of the assessment test generated for 9th graders, which shows a good spread of problems across the whole 9th-grade curriculum and a reasonable distribution of difficulty levels.
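The combinatorial search described in the abstract can be illustrated with a small genetic algorithm over a snapshot matrix. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `fitness` and `assemble_test`, the weight `alpha`, and the choice to compare per-learner mean scores on the candidate test against per-learner mean scores on the whole pool are all hypothetical, since the abstract does not spell out the exact objective or the genetic operators.

```python
import numpy as np


def fitness(snapshot, subset, alpha=1.0):
    """Score a candidate test; higher is better.

    Assumption: "representative" means a low RMSE between each learner's
    mean score on the subset and on the whole pool, and "discriminative"
    means a high standard deviation of subset scores across learners.
    """
    pool_mean = snapshot.mean(axis=1)             # per-learner score on the whole pool
    test_mean = snapshot[:, subset].mean(axis=1)  # per-learner score on the candidate test
    rmse = np.sqrt(np.mean((test_mean - pool_mean) ** 2))
    spread = test_mean.std()
    return spread - alpha * rmse


def assemble_test(snapshot, test_size, pop_size=40, generations=100,
                  mutation_rate=0.2, seed=0):
    """Search for `test_size` question indices with a simple elitist GA."""
    n_questions = snapshot.shape[1]
    if not 0 < test_size < n_questions:
        raise ValueError("test_size must be between 1 and n_questions - 1")
    if pop_size < 4:
        raise ValueError("pop_size must be at least 4")

    rng = np.random.default_rng(seed)
    # Each individual is a set of distinct question indices.
    population = [rng.choice(n_questions, size=test_size, replace=False)
                  for _ in range(pop_size)]

    for _ in range(generations):
        scores = np.array([fitness(snapshot, ind) for ind in population])
        order = np.argsort(scores)[::-1]
        # Elitism: the better half survives unchanged.
        elite = [population[i] for i in order[: pop_size // 2]]

        children = []
        while len(elite) + len(children) < pop_size:
            i, j = rng.choice(len(elite), size=2, replace=False)
            # Crossover: sample the child from the union of both parents.
            genes = np.union1d(elite[i], elite[j])
            child = rng.choice(genes, size=test_size, replace=False)
            # Mutation: swap one question for one not yet in the child.
            if rng.random() < mutation_rate:
                unused = np.setdiff1d(np.arange(n_questions), child)
                child[rng.integers(test_size)] = rng.choice(unused)
            children.append(child)
        population = elite + children

    best = max(population, key=lambda ind: fitness(snapshot, ind))
    return np.sort(best)


# Usage with a synthetic snapshot: 30 learners, 60 questions.
demo_rng = np.random.default_rng(42)
demo_snapshot = demo_rng.uniform(0.0, 1.0, size=(30, 60))
demo_test = assemble_test(demo_snapshot, test_size=10)
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases between generations. A single weighted sum is only one way to combine the two criteria; a multi-objective formulation would be an equally plausible reading of the abstract.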