Fairness and explainability are two important and closely related requirements of decision-making systems. While each has been studied extensively on its own, little effort has gone into studying the fairness of the explanations themselves. Current explanations can be unfair to an individual: for example, counterfactual explanations may propose different actions for changing the output class to two similar individuals. In this work, we formally and empirically study individual fairness, and its mathematical formalization as robustness, for counterfactual explanations as a prominent instance of contrasting explanations. In addition, we propose using plausible counterfactuals instead of closest counterfactuals to improve the individual fairness of counterfactual explanations.
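The instability of closest counterfactuals described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-dimensional toy classifier `predict`, the brute-force search `closest_counterfactual`, and the candidate grid. Two individuals that differ by only 0.1 are advised to move in opposite directions.

```python
def predict(x):
    """Toy classifier (hypothetical, for illustration): class 1 iff |x| > 1."""
    return int(abs(x) > 1.0)


def closest_counterfactual(x, candidates):
    """Return the candidate nearest to x whose predicted class differs from x's."""
    target = 1 - predict(x)
    valid = [c for c in candidates if predict(c) == target]
    return min(valid, key=lambda c: abs(c - x))


# Candidate grid on [-3, 3] with step 0.01 (an arbitrary choice for this sketch).
candidates = [i / 100 for i in range(-300, 301)]

# Two similar individuals, both currently assigned class 0.
x_a, x_b = 0.05, -0.05

cf_a = closest_counterfactual(x_a, candidates)
cf_b = closest_counterfactual(x_b, candidates)

# The inputs are close together, but the recommended changes point in
# opposite directions, so the two explanations are far apart.
input_gap = abs(x_a - x_b)
explanation_gap = abs(cf_a - cf_b)
print(f"input gap: {input_gap:.2f}, explanation gap: {explanation_gap:.2f}")
```

In this toy setting the decision boundary has two branches, and the closest counterfactual jumps from one branch to the other as the input crosses zero. A robustness-style notion of individual fairness would flag this, since a small change in the input produces a large change in the explanation.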