Counterfactual explanations (CEs) are methods for generating an alternative scenario that produces a different, desirable outcome. For example, if a student is predicted to fail a course, a counterfactual explanation can suggest changes under which the student would instead be predicted to pass. The applications are numerous. However, CEs are currently generated from machine learning models that do not necessarily reflect the true causal structure of the data, which can bias the resulting CE quantities. In this study I propose to test CEs against Judea Pearl's method of computing counterfactuals, which, surprisingly, has so far not appeared in the CE literature. I further evaluate these CEs on three different causal structures to show how the true underlying causal structure affects the CEs that are generated. The study presents a method of evaluating CEs using Pearl's approach and shows, albeit on a limited sample size, that thirty percent of the CEs conflicted with those computed by Pearl's method. This indicates that CEs cannot simply be trusted, and that it is vital to know the true causal structure before blindly computing counterfactuals from the original machine learning model.
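As a minimal sketch of the distinction at issue, the following Python contrasts a model-based CE query with Pearl's three-step abduction-action-prediction procedure on a toy linear structural causal model. The SCM, variable names (sleep, study_hours, score), and coefficients are illustrative assumptions of my own, not taken from the study.

    import numpy as np

    rng = np.random.default_rng(seed=0)

    # Toy structural causal model (illustrative, not from the study):
    #   sleep       := U1
    #   study_hours := 0.8 * sleep + U2
    #   score       := 0.5 * study_hours + 0.3 * sleep + U3
    def scm(u1, u2, u3, do_sleep=None):
        sleep = u1 if do_sleep is None else do_sleep
        study_hours = 0.8 * sleep + u2
        score = 0.5 * study_hours + 0.3 * sleep + u3
        return sleep, study_hours, score

    # Factual student, generated from sampled exogenous noise.
    u = rng.normal(size=3)
    sleep, study_hours, score = scm(*u)

    # Candidate counterfactual action: one extra hour of sleep.
    sleep_cf = sleep + 1.0

    # Model-based CE: overwrite the feature and re-query the predictor,
    # leaving all other features at their factual values. Here the "model"
    # is the true regression function E[score | sleep, study_hours].
    def model_predict(sleep, study_hours):
        return 0.5 * study_hours + 0.3 * sleep

    ce_score = model_predict(sleep_cf, study_hours)

    # Pearl's three steps:
    # 1. Abduction: recover the exogenous terms from the evidence
    #    (exact here because the SCM is invertible and fully specified).
    u1_hat = sleep
    u2_hat = study_hours - 0.8 * sleep
    u3_hat = score - 0.5 * study_hours - 0.3 * sleep
    # 2. Action: intervene do(sleep := sleep_cf).
    # 3. Prediction: propagate the mechanisms with the abducted noise fixed.
    _, study_hours_cf, pearl_score = scm(u1_hat, u2_hat, u3_hat, do_sleep=sleep_cf)

    print(f"model-based CE predicted change: "
          f"{ce_score - model_predict(sleep, study_hours):+.2f}")
    print(f"Pearl counterfactual change:     {pearl_score - score:+.2f}")
    # The model-based query credits only sleep's direct effect (+0.3),
    # while Pearl's procedure also propagates the induced increase in
    # study_hours (+0.5 * 0.8), for a total effect of +0.7.

The two answers diverge whenever the intervened feature has causal descendants among the other features: the model-based query holds those descendants fixed, whereas Pearl's procedure lets them respond to the intervention.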