Counterfactual explanations (CFXs) provide human-understandable justifications for model predictions, enabling actionable recourse and enhancing interpretability. To be reliable, CFXs must avoid regions of high predictive uncertainty, where explanations may be misleading or inapplicable. However, existing methods often neglect uncertainty or lack principled mechanisms for incorporating it with formal guarantees. We propose CONFEX, a novel method for generating uncertainty-aware counterfactual explanations using Conformal Prediction (CP) and Mixed-Integer Linear Programming (MILP). CONFEX explanations are designed to provide local coverage guarantees, addressing the fact that CFX generation violates the exchangeability assumption underlying CP. To do so, we develop a novel localised CP procedure that admits an efficient MILP encoding by leveraging an offline tree-based partitioning of the input space. In this way, CONFEX generates CFXs with rigorous guarantees on both predictive uncertainty and optimality. We evaluate CONFEX against state-of-the-art methods across diverse benchmarks and metrics, demonstrating that our uncertainty-aware approach yields robust and plausible explanations.
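The localised CP idea mentioned in the abstract can be illustrated with a minimal sketch: a split-conformal procedure whose calibration quantile is computed separately within each leaf of an offline decision-tree partition of the input space, and a candidate counterfactual is screened by requiring low predictive uncertainty (a singleton prediction set) at that point. This is not the CONFEX MILP encoding; the dataset, the choice of models, the nonconformity score, and the singleton-set screening rule are all illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the authors' implementation):
# split conformal prediction with leaf-local calibration quantiles.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=6, n_informative=4, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Offline tree-based partition: each leaf acts as a local calibration region.
partition = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

alpha = 0.1
# Nonconformity score: one minus the predicted probability of the true label.
cal_scores = 1.0 - clf.predict_proba(X_cal)[np.arange(len(y_cal)), y_cal]
cal_leaves = partition.apply(X_cal)

# Per-leaf split-conformal quantile of the calibration scores.
local_q = {}
for leaf in np.unique(cal_leaves):
    s = np.sort(cal_scores[cal_leaves == leaf])
    n = len(s)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    local_q[leaf] = s[k - 1] if k <= n else 1.0  # scores are bounded by 1 here

def prediction_set(x):
    """Conformal prediction set whose threshold is local to the leaf containing x."""
    x = np.asarray(x).reshape(1, -1)
    q = local_q.get(partition.apply(x)[0], 1.0)
    probs = clf.predict_proba(x)[0]
    return [c for c, p in zip(clf.classes_, probs) if 1.0 - p <= q]

# A candidate counterfactual could be accepted only if its prediction set is a
# singleton, i.e. the model is locally confident at that point.
print(prediction_set(X_cal[0]))
```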