Aim: To compare students' essay writing performance with and without ChatGPT-3 as a writing assistant tool. Materials and methods: Eighteen students participated in the study (nine in the control group and nine in the experimental group, which used ChatGPT-3). We scored essay elements with grades (A-D) and corresponding numerical values (4-1). We compared essay scores against students' GPAs, writing time, authenticity, and content similarity. Results: The average grade was C for both groups: control (2.39, SD=0.71) and experimental (2.00, SD=0.73). None of the predictors affected essay scores: group (P=0.184), writing duration (P=0.669), module (P=0.388), and GPA (P=0.532). Text unauthenticity was slightly higher in the experimental group (11.87%, SD=13.45% vs 9.96%, SD=9.81%), but similarity among essays was generally low across the overall sample (Jaccard similarity index ranging from 0 to 0.054). In the experimental group, the AI classifier flagged more texts as potentially AI-generated. Conclusions: This study found no evidence that using ChatGPT-3 as a writing tool improves essay quality, as the control group outperformed the experimental group on most parameters.
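The abstract reports pairwise essay similarity with the Jaccard index. A minimal sketch of how such an index can be computed over token sets (the exact tokenization used in the study is not specified here, so whitespace tokenization below is an assumption):

```python
def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Jaccard index between two texts, treated as sets of
    whitespace-separated tokens: |A ∩ B| / |A ∪ B|."""
    # NOTE: whitespace/lowercase tokenization is an illustrative assumption,
    # not necessarily the preprocessing used in the study.
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0  # convention for two empty texts
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Example: 2 shared tokens out of 4 distinct tokens -> 0.5
print(jaccard_similarity("the cat sat", "the cat ran"))
```

Values near 0 (as in the reported range of 0 to 0.054) indicate essays that share almost no vocabulary, i.e. low mutual similarity.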