We investigate the effectiveness and reliability of an artificial intelligence (AI)-based grading system for a handwritten general chemistry exam by comparing AI-assigned scores with human grading across question types. Exam pages and grading rubrics were uploaded as images to accommodate chemical reaction equations, short and long open-ended answers, numerical and symbolic derivations, and drawings and sketches in pencil-and-paper format. Linear regression analyses and psychometric evaluations reveal high agreement between AI and human graders for textual and chemical-reaction questions, but lower reliability for numerical and graphical tasks. The findings underscore the need for human oversight, applied through selective filtering of AI-assigned scores, to ensure grading accuracy. The results indicate promising applications for AI in routine assessment tasks, though careful consideration must be given to students' perceptions of fairness and trust when integrating AI-based grading into educational practice.