As increasingly complex AI systems are introduced into our daily lives, it becomes important for such systems to be capable of explaining the rationale for their decisions and allowing users to contest these decisions. A significant hurdle to enabling such explanatory dialogue could be the vocabulary mismatch between the user and the AI system. This paper introduces methods for providing contrastive explanations in terms of user-specified concepts in sequential decision-making settings where the system's model of the task may be best represented as an inscrutable model. We do this by building partial symbolic models of a local approximation of the task, which can be leveraged to answer user queries. We test these methods on a popular Atari game (Montezuma's Revenge) and on variants of Sokoban (a well-known planning benchmark), and we report the results of user studies evaluating whether people find explanations generated in this form useful.