A growing literature studies how humans incorporate advice from algorithms. This study examines an algorithm with millions of daily users: ChatGPT. In a preregistered study, 118 student participants answer 2,828 multiple-choice questions across 25 academic subjects. Participants receive advice from a GPT model and can update their initial responses. The advisor's identity ("AI chatbot" versus a human "expert"), presence of a written justification, and advice correctness do not significantly affect weight on advice. Instead, participants weigh advice more heavily if they (1) are unfamiliar with the topic, (2) used ChatGPT in the past, or (3) received more accurate advice previously. The last two effects -- algorithm familiarity and experience -- are stronger with an AI chatbot as the advisor. Participants that receive written justifications are able to discern correct advice and update accordingly. Student participants are miscalibrated in their judgements of ChatGPT advice accuracy; one reason is that they significantly misjudge the accuracy of ChatGPT on 11/25 topics. Participants under-weigh advice by over 50% and can score better by trusting ChatGPT more.
翻译:暂无翻译