Fueled by the call for formative assessments, diagnostic classification models (DCMs) have recently gained popularity in psychometrics. Despite their potential to provide diagnostic information that aids classroom instruction and students' learning, empirical applications of DCMs to classroom assessments have been highly limited. This is partly because the performance of DCMs under different estimation methods in small-sample contexts has not yet been well explored. Hence, this study investigates respondent classification and item parameter estimation under different estimation methods, using a comprehensive simulation design that resembles classroom assessments. The key findings are as follows: (1) although no marked difference in respondent classification accuracy was observed among the maximum likelihood (ML), Bayesian, and nonparametric methods, the Bayesian method provided slightly more accurate respondent classification than the ML method in parsimonious DCMs, whereas the ML method yielded slightly better results than the Bayesian method in complex DCMs; (2) while item parameter recovery was poor for both the Bayesian and ML methods, the Bayesian method exhibited unstable slip values under complex DCMs owing to the multimodality of their posteriors, and the ML method produced irregular estimates under parsimonious DCMs that appear well estimated because of a boundary problem.
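To make the terms "slip" and "respondent classification" concrete, the following is a minimal sketch of ML classification under the DINA model, which is often used as an example of a parsimonious DCM. The abstract does not name DINA, so that choice is an assumption, as are the function names, the Q-matrix, and the slip and guess values, all of which are hypothetical illustrations rather than the study's actual design.

```python
from itertools import product
from math import log


def dina_prob(alpha, q_row, slip, guess):
    """P(correct) under DINA: 1 - slip if every required attribute is mastered, else guess."""
    has_all_required = all(a >= q for a, q in zip(alpha, q_row))
    return 1.0 - slip if has_all_required else guess


def classify_ml(responses, q_matrix, slips, guesses):
    """Return the attribute profile with the highest log-likelihood for one respondent.

    Assumes slip and guess lie strictly inside (0, 1); values on the boundary
    would make log() fail for some response patterns.
    """
    n_attr = len(q_matrix[0])
    best_profile, best_ll = None, float("-inf")
    # Enumerate all 2^K attribute profiles (feasible only for small K).
    for alpha in product((0, 1), repeat=n_attr):
        ll = 0.0
        for x, q_row, s, g in zip(responses, q_matrix, slips, guesses):
            p = dina_prob(alpha, q_row, s, g)
            ll += log(p) if x == 1 else log(1.0 - p)
        if ll > best_ll:
            best_profile, best_ll = alpha, ll
    return best_profile


# Hypothetical 4-item, 2-attribute test (illustrative values only).
Q = [(1, 0), (0, 1), (1, 1), (1, 0)]
SLIPS = [0.1, 0.1, 0.1, 0.1]
GUESSES = [0.2, 0.2, 0.2, 0.2]

profile = classify_ml([1, 0, 0, 1], Q, SLIPS, GUESSES)
print(profile)
```

Here the respondent answers correctly only the two items that require the first attribute alone, so the most likely profile is mastery of the first attribute and non-mastery of the second. The sketch treats item parameters as known; in the small-sample setting the abstract describes, they would have to be estimated as well.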