People work with AI systems to improve their decision making, but often under- or over-rely on AI predictions and perform worse than they would have unassisted. To help people appropriately rely on AI aids, we propose showing them behavior descriptions, details of how AI systems perform on subgroups of instances. We tested the efficacy of behavior descriptions through user studies with 225 participants in three distinct domains: fake review detection, satellite image classification, and bird classification. We found that behavior descriptions can increase human-AI accuracy through two mechanisms: helping people identify AI failures and increasing people's reliance on the AI when it is more accurate. These findings highlight the importance of people's mental models in human-AI collaboration and show that informing people of high-level AI behaviors can significantly improve AI-assisted decision making.