Although complex artificial intelligence (AI) systems in data-intensive domains have seen tremendous performance improvements, their black-box nature leads to a lack of trustworthiness. Post-hoc interpretability methods explain the prediction of a black-box ML model for a single instance, and domain experts leverage such explanations to diagnose the underlying biases of these models. Despite their efficacy in providing valuable insights, existing approaches fail to deliver consistent and reliable explanations. In this paper, we propose an active learning-based technique called UnRAvEL (Uncertainty driven Robust Active Learning Based Locally Faithful Explanations), which consists of a novel acquisition function that is locally faithful and uses uncertainty-driven sampling based on the posterior distribution on the probabilistic locality using Gaussian process regression (GPR). We present a theoretical analysis of UnRAvEL by treating it as a local optimizer and analyzing its regret in terms of instantaneous regrets over a global optimizer. We demonstrate the efficacy of the local samples generated by UnRAvEL by incorporating different kernels, such as the Matérn and linear kernels, in GPR. Through a series of experiments, we show that UnRAvEL outperforms the baselines with respect to stability and local fidelity on several real-world models and datasets. We show that UnRAvEL is an efficient surrogate dataset generator by deriving importance scores on this surrogate dataset using sparse linear models. We also showcase the sample efficiency and flexibility of the developed framework on the ImageNet dataset using a pre-trained ResNet model.
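The general shape of the pipeline described in the abstract (query a black box near one instance, pick each new query where a Gaussian process is most uncertain, then fit a linear model to the collected surrogate dataset) might be sketched as follows. This is a minimal illustration under stated assumptions, not the UnRAvEL implementation: the acquisition score (posterior standard deviation minus a distance penalty), the `locality_weight` and `radius` parameters, the fixed kernel length scale, and the function names are all hypothetical, and ordinary least squares stands in for the sparse linear model.

```python
import numpy as np


def matern52(A, B, length_scale=1.0):
    """Matern 5/2 kernel between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    s = np.sqrt(5.0) * np.sqrt(np.maximum(sq, 0.0)) / length_scale
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)


def gp_posterior(X, y, Xq, noise=1e-4):
    """Zero-mean GP regression: posterior mean and std at query points Xq."""
    K = matern52(X, X) + noise * np.eye(len(X))
    Ks = matern52(X, Xq)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    # The prior variance k(x, x) equals 1 for this kernel.
    var = np.clip(1.0 - (v ** 2).sum(0), 0.0, None)
    return mean, np.sqrt(var)


def explain(black_box, x0, n_queries=30, radius=1.0,
            n_candidates=200, locality_weight=0.1, seed=0):
    """Build a local surrogate dataset around x0 and fit a linear model to it.

    Returns (feature weights, queried inputs, black-box outputs).
    """
    rng = np.random.default_rng(seed)
    X = x0[None, :].copy()
    y = np.array([black_box(x0)])
    for _ in range(n_queries):
        # Random candidates in a Gaussian neighbourhood of the instance.
        cand = x0 + radius * rng.standard_normal((n_candidates, x0.size))
        _, std = gp_posterior(X, y - y.mean(), cand)
        # ASSUMPTION: hypothetical acquisition score, not the paper's.
        # It favours uncertain candidates that stay close to x0. With fixed
        # kernel hyperparameters, only the posterior std drives the choice.
        score = std - locality_weight * np.linalg.norm(cand - x0, axis=1)
        x_next = cand[np.argmax(score)]
        X = np.vstack([X, x_next])
        y = np.append(y, black_box(x_next))
    # Stand-in for a sparse linear model: least squares with an intercept.
    A = np.hstack([X - x0, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], X, y
```

Calling `explain(f, x0)` with a scalar-valued function `f` and a one-dimensional NumPy array `x0` returns one weight per feature, which can be read as local importance scores. Swapping `matern52` for a linear kernel would only change the covariance function, although the prior variance in `gp_posterior` would then no longer be constant and would need to be computed per query point.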