Counterfactual explanation is a branch of interpretable machine learning that produces a perturbed sample that changes the model's original decision. The generated samples can serve as recommendations for end-users to achieve their desired outputs. Most current counterfactual explanation approaches are gradient-based, and can therefore only optimize differentiable loss functions over continuous variables. Gradient-free methods have accordingly been proposed to handle categorical variables, but they have several major limitations: 1) causal relationships among features are typically ignored when generating counterfactuals, possibly resulting in impractical guidelines for decision-makers; 2) generating a counterfactual sample is prohibitively slow and requires extensive parameter tuning to combine different loss functions. In this work, we propose a causal structure model to preserve the causal relationships underlying the features of the counterfactual. In addition, we design a novel gradient-free optimization method based on a multi-objective genetic algorithm that generates counterfactual explanations for mixed-type data with continuous and categorical features. Numerical experiments demonstrate that our method compares favorably with state-of-the-art methods and, being gradient-free, is applicable to any prediction model. All source code and data are available at \url{https://github.com/tridungduong16/multiobj-scm-cf}.
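To make the idea of a gradient-free, multi-objective counterfactual search over mixed-type data concrete, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the toy model `predict_proba`, the two features (`income`, `education`), the two objectives (prediction gap and a simple mixed-type distance), the mutation operator, and the crude thinning step that stands in for a proper diversity mechanism. The causal model described above is omitted entirely.

```python
import random

EDUCATION = ["highschool", "bachelor", "master"]  # hypothetical categorical feature
INCOME_MAX = 100.0                                # hypothetical continuous feature in [0, 100]

def predict_proba(x):
    """Toy black-box model (assumption): returns P(class 1), exposes no gradients."""
    income, education = x
    return 0.006 * income + 0.2 * EDUCATION.index(education)

def objectives(x, original, target=1):
    """Two losses to minimize: prediction gap and mixed-type distance."""
    gap = abs(target - predict_proba(x))
    d_income = abs(x[0] - original[0]) / INCOME_MAX
    d_education = 0.0 if x[1] == original[1] else 1.0
    return (gap, (d_income + d_education) / 2.0)

def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))

def pareto_front(candidates, scores):
    """Keep the candidates whose score vector no other score dominates."""
    return [c for c, s in zip(candidates, scores)
            if not any(dominates(t, s) for t in scores)]

def mutate(x, rng):
    """Perturb one feature: Gaussian step for income, resampling for education."""
    income, education = x
    if rng.random() < 0.5:
        income = min(INCOME_MAX, max(0.0, income + rng.gauss(0.0, 10.0)))
    else:
        education = rng.choice(EDUCATION)
    return (income, education)

def search(original, target=1, generations=30, size=20, seed=0):
    """Evolve an archive of non-dominated candidates; return the valid ones."""
    rng = random.Random(seed)
    archive = [original]
    for _ in range(generations):
        children = [mutate(rng.choice(archive), rng) for _ in range(size)]
        pool = list(dict.fromkeys(archive + children))  # drop exact duplicates
        scores = [objectives(x, original, target) for x in pool]
        front = sorted(pareto_front(pool, scores),
                       key=lambda x: objectives(x, original, target))
        if len(front) > size:  # crude thinning that keeps both extremes
            step = (len(front) - 1) / (size - 1)
            front = [front[round(i * step)] for i in range(size)]
        archive = front
    # Counterfactuals: archive members the model assigns to the target class.
    return [x for x in archive if (predict_proba(x) >= 0.5) == (target == 1)]
```

Calling `search((30.0, "highschool"))` returns a list of candidate counterfactuals that trade off confidence in the target class against distance from the original sample. Because the two objectives are compared through Pareto dominance rather than summed, no weighting between loss terms has to be tuned in this sketch, and the model is only ever queried for predictions.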