Post-hoc explanation methods have become increasingly relied upon for understanding black-box classifiers in high-stakes applications, precipitating a need for reliable explanations. While numerous explanation methods have been proposed, recent works have shown that many existing methods can be inconsistent or unstable. In addition, high-performing classifiers are often highly nonlinear and can exhibit complex behavior around the decision boundary, leading to brittle or misleading local explanations. Therefore, there is a pressing need to quantify the uncertainty of such explanation methods in order to understand when explanations are trustworthy. We introduce a novel uncertainty quantification method parameterized by a Gaussian Process model, which combines the uncertainty approximation of existing methods with a novel geodesic-based similarity that captures the complexity of the target black-box decision boundary. The proposed framework is highly flexible; it can be used with any black-box classifier and feature attribution method to amortize uncertainty estimates for explanations. We show theoretically that our proposed geodesic-based kernel similarity increases with the complexity of the decision boundary. Empirical results on multiple tabular and image datasets show that our decision boundary-aware uncertainty estimate improves understanding of explanations as compared to existing methods.