Despite large recent gains in natural language understanding driven by large language models, voice assistants still often fail to meet user expectations. In this study, we conducted a mixed-methods analysis of how voice assistant failures affect users' trust in their voice assistants. To illustrate how users have experienced these failures, we contribute a crowdsourced dataset of 199 voice assistant failures, categorized across 12 failure sources. Drawing on interview and survey data, we find that certain failures, such as those caused by overcapturing users' input, damage user trust more than others. We additionally examine how failures affect users' willingness to rely on voice assistants for future tasks. After a failure, users often stop using their voice assistants for that specific task for a short period of time before resuming similar usage. We demonstrate the importance of low-stakes tasks, such as playing music, for building trust after failures.