Assessing communication and collaboration at scale depends on the labor-intensive task of coding communication data into categories according to different frameworks. Prior research has established that ChatGPT can be instructed directly with coding rubrics to code communication data, achieving accuracy comparable to that of human raters. However, whether the coding produced by ChatGPT or similar AI technology exhibits bias against different demographic groups, such as gender and racial groups, remains unclear. To fill this gap, this paper investigates ChatGPT-based automated coding of communication data using a typical coding framework for collaborative problem solving, examining differences across gender and racial groups. The analysis draws on data from three types of collaborative tasks: negotiation, problem solving, and decision making. Our results show that ChatGPT-based coding exhibits no significant bias across gender or racial groups, paving the way for its adoption in large-scale assessment of collaboration and communication.
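The bias check the abstract describes, comparing the distribution of AI-assigned codes across demographic groups, can be sketched as a Pearson chi-square test over a group-by-code contingency table. This is a minimal, illustrative sketch: the group names and code labels below are hypothetical, not taken from the paper's data or its actual coding framework.

```python
from collections import Counter

def chi_square_bias_check(codes_by_group):
    """Pearson chi-square statistic over code frequencies per group.

    A statistic near zero means the groups received near-identical
    code distributions, i.e. no evidence of coding bias.
    """
    categories = sorted({c for codes in codes_by_group.values() for c in codes})
    groups = sorted(codes_by_group)
    counts = {g: Counter(codes_by_group[g]) for g in groups}
    n = sum(len(codes) for codes in codes_by_group.values())
    col_totals = {c: sum(counts[g][c] for g in groups) for c in categories}

    stat = 0.0
    for g in groups:
        row_total = len(codes_by_group[g])
        for c in categories:
            expected = row_total * col_totals[c] / n
            observed = counts[g][c]
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical AI-assigned codes for utterances from two demographic groups
codes = {
    "group_a": ["negotiate", "propose", "negotiate", "agree"],
    "group_b": ["negotiate", "propose", "agree", "negotiate"],
}
print(chi_square_bias_check(codes))  # identical distributions -> 0.0
```

In practice the statistic would be compared against a chi-square distribution with (groups − 1) × (codes − 1) degrees of freedom to obtain a p-value; a non-significant result is the pattern the paper reports.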