We tackle the problem of computing counterfactual explanations -- minimal changes to the features that flip an undesirable model prediction. We propose a solution to this problem for linear Support Vector Machine (SVM) models. Moreover, we introduce a way to account for weighted actions, which allow more change in certain features than in others. In particular, we show how to find counterfactual explanations with the goal of increasing model interpretability. These explanations are valid, change only actionable features, remain close to the data distribution, are sparse, and take into account correlations between features. We cast this as a mixed-integer programming optimization problem. Additionally, we introduce two novel scale-invariant cost functions for assessing the quality of counterfactual explanations and use them to evaluate our approach on a real medical dataset. Finally, we build an SVM model that uses protected features to predict whether law students will pass the bar exam, and use our algorithms to uncover the model's inherent biases.
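The abstract casts counterfactual search as a mixed-integer program; as a minimal illustrative sketch only (not the paper's method), for a plain linear SVM the minimal-L2 counterfactual has a closed form: project the input onto the decision hyperplane and push it slightly past the boundary. All variable names and values here are hypothetical, and the sketch ignores the actionability, sparsity, and correlation constraints that the paper's formulation handles.

```python
import numpy as np

def l2_counterfactual(x, w, b, margin=1e-3):
    """Smallest L2 perturbation of x that flips sign(w.x + b).

    For a linear classifier this is the projection of x onto the
    hyperplane w.x + b = 0, nudged past it by `margin`.
    """
    f = np.dot(w, x) + b
    # Move along the weight vector, against the current decision value.
    delta = -(f + np.sign(f) * margin) * w / np.dot(w, w)
    return x + delta

# Toy 2-D linear SVM with hypothetical weights and bias.
w = np.array([1.0, -2.0])
b = 0.5
x = np.array([2.0, 0.0])           # classified positive: w.x + b = 2.5
x_cf = l2_counterfactual(x, w, b)
print(np.sign(np.dot(w, x_cf) + b))  # prediction flips to -1.0
```

In practice the L2 ball would be replaced by a weighted action cost, and integer variables would enforce sparsity and restrict changes to actionable features, which is what motivates the mixed-integer formulation.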