In settings where explanations of black-box models are useful, the fairness of the black-box model is often a relevant concern as well. However, the link between the fairness of the black-box model and the behavior of its explanations is unclear. Focusing on explanations applied to tabular datasets, we suggest that explanations do not necessarily preserve the fairness properties of the black-box algorithm. In other words, explanation algorithms can ignore or obscure critical, relevant properties, producing incorrect or misleading explanations. More broadly, we propose future research directions for evaluating and generating explanations that are informative and relevant from a fairness perspective.