Recent work has highlighted consistency issues with explanations: methods generate local explanations that seem reasonable on an instance-by-instance basis, yet are inconsistent across instances. This suggests not only that instance-wise explanations can be unreliable, but, more importantly, that a user interacting with a system through multiple inputs may actually lose confidence in it. To better analyse this issue, in this work we treat explanations as objects that can themselves be reasoned about, and we present a formal model of the interaction between user and system as a sequence of inputs, outputs, and explanations. We argue that an explanation can be seen as committing the system to some behaviour (even if only prima facie), suggesting a form of entailment which, we contend, should be regarded as non-monotonic. This allows us: 1) to resolve some of the inconsistencies in explanations under consideration, for example via a specificity relation; and 2) to consider properties from the non-monotonic reasoning literature and discuss their desirability, gaining further insight into the interactive explanation scenario.
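To fix intuitions, here is one possible, purely illustrative formalisation of the scenario sketched above; the notation is ours and need not match the paper's actual definitions. An interaction is a finite sequence of input-output-explanation triples, each explanation defeasibly commits the system to some behaviour, and the resulting entailment may be retracted as the interaction grows:

\[
  \mathcal{H}_n \;=\; \langle (x_1, y_1, e_1), \dots, (x_n, y_n, e_n) \rangle
  \qquad \text{(interaction history)}
\]
\[
  e_i \mathrel{|\!\sim} \beta_i
  \qquad \text{(explanation $e_i$ prima facie commits the system to behaviour $\beta_i$)}
\]
\[
  \mathcal{H}_n \mathrel{|\!\sim} \beta
  \;\;\not\Longrightarrow\;\;
  \mathcal{H}_{n+1} \mathrel{|\!\sim} \beta
  \qquad \text{(non-monotonicity: commitments may be retracted)}
\]

Under this reading, a specificity relation $e_j \prec e_i$ lets a later, more specific explanation override, rather than contradict, an earlier, more general one, in the same spirit as defeasible inheritance, where "penguins do not fly" overrides "birds fly".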