We investigate a formalism for the conditions of a successful explanation of AI. We consider "success" to depend not only on what information the explanation contains, but also on what information the human explainee understands from it. The theory of mind literature discusses the folk concepts that humans use to understand and generalize behavior. We posit that folk concepts of behavior provide us with a "language" with which humans understand behavior. We use these folk concepts as a framework of *social attribution* by the human explainee -- the information constructs that humans are likely to comprehend from explanations -- by introducing a blueprint for an explanatory narrative (Figure 1) that explains AI behavior with these constructs. We then demonstrate, in a qualitative evaluation, that many XAI methods today can be mapped to folk concepts of behavior. This allows us to uncover the failure modes that prevent current methods from explaining successfully -- i.e., the information constructs that are missing from any given XAI method, and whose inclusion can decrease the likelihood of misunderstanding AI behavior.