Towards a Science of Human-AI Decision Making: A Survey of Empirical Studies

As AI systems demonstrate increasingly strong predictive performance, their adoption has grown in numerous domains. However, in high-stakes domains such as criminal justice and healthcare, full automation is often not desirable due to safety, ethical, and legal concerns, yet fully manual approaches can be inaccurate and time-consuming. As a result, there is growing interest in the research community in augmenting human decision making with AI assistance. Besides developing AI technologies for this purpose, the emerging field of human-AI decision making must embrace empirical approaches to form a foundational understanding of how humans interact and work with AI to make decisions. To invite and help structure research efforts towards a science of understanding and improving human-AI decision making, we survey recent literature on empirical human-subject studies on this topic. We summarize the study design choices made in over 100 papers in three important aspects: (1) decision tasks, (2) AI models and AI assistance elements, and (3) evaluation metrics. For each aspect, we summarize current trends, discuss gaps in current practices of the field, and make a list of recommendations for future research. Our survey highlights the need to develop common frameworks to account for the design and research spaces of human-AI decision making, so that researchers can make rigorous choices in study design, and the research community can build on each other's work and produce generalizable scientific knowledge. We also hope this survey will serve as a bridge for HCI and AI communities to work together to mutually shape the empirical science and computational technologies for human-AI decision making.
