Under fair lending laws and the General Data Protection Regulation (GDPR), the ability to explain a model's prediction is of paramount importance. High-quality explanations are the first step in assessing fairness. Counterfactuals are valuable tools for explainability: they provide actionable, comprehensible explanations for the individual who is subject to decisions made from the prediction. It is therefore important to establish a baseline for producing them. We propose a simple method that generates counterfactuals by using gradient descent to search the latent space of an autoencoder, and we benchmark it against approaches that search for counterfactuals in feature space. Additionally, we implement metrics to concretely evaluate the quality of the counterfactuals. We show that latent-space counterfactual generation strikes a balance between the speed of basic feature-space gradient descent methods and the sparseness and authenticity of counterfactuals generated by more complex feature-space-oriented techniques.
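The idea of searching an autoencoder's latent space by gradient descent can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, since the abstract does not specify the architecture or loss: the autoencoder is a toy linear one (decoder matrix `D`, encoder taken as its pseudoinverse), the model is a toy logistic classifier with hypothetical weights `w` and bias `b`, and the objective is a squared prediction error plus a proximity penalty weighted by a hypothetical `lam`.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def predict(x, w, b):
    """Toy logistic classifier: probability of the favourable class."""
    return sigmoid(x @ w + b)

def latent_counterfactual(x, D, w, b, target=1.0, lam=0.01, lr=0.5, steps=1000):
    """Gradient descent over the latent code z of a toy linear autoencoder.

    Assumed objective (not taken from the source):
        (predict(D z) - target)^2 + lam * ||z - z0||^2
    """
    E = np.linalg.pinv(D)      # assumed linear encoder: pseudoinverse of the decoder
    z0 = E @ x                 # latent code of the original input
    z = z0.copy()
    for _ in range(steps):
        p = predict(D @ z, w, b)
        # Stop as soon as the decoded point is classified on the target side.
        if (p > 0.5) == (target > 0.5):
            break
        # Analytic gradient of the assumed objective with respect to z.
        grad = 2.0 * (p - target) * p * (1.0 - p) * (D.T @ w) + 2.0 * lam * (z - z0)
        z -= lr * grad
    return D @ z               # decode back to feature space

# Toy setup: 3 features, 2 latent dimensions (all values hypothetical).
D = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
w = np.array([1.0, -1.0, 0.5])
b = -0.5
x = D @ np.array([-1.0, 1.0])  # an input the classifier rejects
x_cf = latent_counterfactual(x, D, w, b)
```

Because the search moves only `z`, every candidate is a decoder output. In this linear toy that means the counterfactual stays in the column space of `D`; with a trained nonlinear autoencoder the analogous hope is that candidates stay close to the data distribution, and the gradient would normally come from automatic differentiation rather than a hand-written formula.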