Research on automated essay scoring has become increasingly important because it provides a method for evaluating students' written responses at scale. Scalable scoring methods are needed as students migrate to online learning environments, which creates the need to evaluate large numbers of written-response assessments. The purpose of this study is to describe and evaluate three active learning methods that can be used to minimize the number of essays that must be scored by human raters while still providing the data needed to train a modern automated essay scoring system. The three active learning methods are the uncertainty-based, the topological-based, and the hybrid method. These three methods were used to select essays from those included in the Automated Student Assessment Prize competition, which were then classified using a scoring model trained with the Bidirectional Encoder Representations from Transformers (BERT) language model. All three active learning methods produced strong results, with the topological-based method producing the most efficient classification. Growth rate accuracy was also evaluated. The active learning methods produced different levels of efficiency under different sample size allocations but, overall, all three methods were highly efficient and produced classifications that were similar to one another.
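As an illustration of the uncertainty-based idea named in the abstract, the following is a minimal sketch of how essays might be chosen for human scoring. The abstract does not state which uncertainty measure the study uses, so the entropy criterion here is an assumption, and the function names (`entropy`, `select_most_uncertain`) and the probability values are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of one essay's predicted score-class probabilities.

    Higher entropy means the scoring model is less certain about the essay.
    (Assumed uncertainty measure; the study's own criterion may differ.)
    """
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, batch_size):
    """Return indices of the `batch_size` essays with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,  # most uncertain essays first
    )
    return ranked[:batch_size]


# Hypothetical model outputs for four unscored essays over three score classes.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, so highly uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

# Essays at these indices would be sent to human raters next.
selected = select_most_uncertain(pool_probs, batch_size=2)
print(selected)
```

In an active learning loop of this kind, the selected essays would be scored by human raters, added to the training set, and the scoring model retrained before the next selection round.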