Counterfactual explanation is a common class of methods for producing local explanations of machine learning decisions. For a given instance, these methods aim to find the smallest modification of feature values that changes the decision predicted by a machine learning model. One of the challenges of counterfactual explanation is the efficient generation of realistic counterfactuals. To address this challenge, we propose VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator that are jointly trained, for regression or classification tasks. VCNet is able both to make predictions and to generate counterfactual explanations without having to solve another minimisation problem. Our contribution is the generation of counterfactuals that are close to the distribution of the predicted class. This is achieved by learning a variational autoencoder conditioned on the output of the predictor, in a joint-training fashion. We present an empirical evaluation on tabular datasets and across several interpretability metrics. The results are competitive with state-of-the-art methods.
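To make the architecture concrete, the following is a minimal numpy sketch of the general idea described above: a predictor produces a class distribution, a variational encoder is conditioned on that prediction, and a counterfactual is obtained by decoding the latent code under a different target class. All layer sizes, weight initialisations, and function names here are illustrative assumptions, not the actual VCNet implementation, and training (the joint loss) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative dimensions (assumptions, not the paper's settings).
d_in, d_h, d_z, n_classes = 8, 16, 4, 3

# Random weights standing in for jointly trained parameters.
W = {name: rng.normal(0.0, 0.1, shape) for name, shape in {
    "pred":   (d_in, n_classes),            # predictor
    "enc":    (d_in + n_classes, d_h),      # encoder, conditioned on prediction
    "mu":     (d_h, d_z),
    "logvar": (d_h, d_z),
    "dec1":   (d_z + n_classes, d_h),       # decoder, conditioned on a class
    "dec2":   (d_h, d_in),
}.items()}

def predict(x):
    """Predictor head: class distribution for instance x."""
    return softmax(x @ W["pred"])

def encode(x, c):
    """Variational encoder conditioned on the class distribution c."""
    h = relu(np.concatenate([x, c], axis=-1) @ W["enc"])
    return h @ W["mu"], h @ W["logvar"]

def decode(z, c):
    """Decoder conditioned on a class encoding c."""
    return relu(np.concatenate([z, c], axis=-1) @ W["dec1"]) @ W["dec2"]

def counterfactual(x, target_class):
    """Encode x with its predicted class, then decode under the
    target class: no extra per-instance minimisation is solved."""
    c_pred = predict(x)
    mu, logvar = encode(x, c_pred)
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    c_target = np.eye(n_classes)[target_class]
    return decode(z, c_target)

x = rng.standard_normal(d_in)
x_cf = counterfactual(x, target_class=1)
print(x_cf.shape)  # same shape as the input instance
```

The key design point this sketch illustrates is that counterfactual generation is a single forward pass: swapping the class condition at decode time yields a candidate close to the distribution the decoder learned for that class, rather than the output of a separate optimisation loop.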