AI models are constantly evolving, with new versions released frequently. Human-AI interaction guidelines encourage notifying users about changes in model capabilities, ideally supported by thorough benchmarking. However, as AI systems integrate into domain-specific workflows, exhaustive benchmarking can become impractical, often resulting in silent or minimally communicated updates. This raises critical questions: Can users notice these updates? What cues do they rely on to distinguish between models? How do such changes affect their behavior and task performance? We address these questions through two studies in the context of facial recognition for historical photo identification: an online experiment examining users' ability to detect model updates, followed by a diary study exploring perceptions in a real-world deployment. Our findings highlight challenges in noticing AI model updates, their impact on downstream user behavior and performance, and how they lead users to develop divergent folk theories. Drawing on these insights, we discuss strategies for effectively communicating model updates in AI-infused systems.