Existing research on fairness-aware recommendation has mainly focused on quantifying fairness and developing fair recommendation models, while neither line of work studies a more substantial problem: identifying the underlying reasons for model disparity in recommendation. This information is critical for recommender system designers to understand the intrinsic recommendation mechanism, and it provides decision makers with insights on how to improve model fairness. Fortunately, with the rapid development of Explainable AI, we can use model explainability to gain insights into model (un)fairness. In this paper, we study the problem of explainable fairness, which helps to understand why a system is fair or unfair and guides the design of fair recommender systems with a more informed and unified methodology. In particular, we focus on a common setting with feature-aware recommendation and exposure unfairness, but the proposed explainable fairness framework is general and can be applied to other recommendation settings and fairness definitions. We propose a Counterfactual Explainable Fairness framework, called CEF, which generates explanations about model fairness that can improve fairness without significantly hurting recommendation performance. The CEF framework formulates an optimization problem to learn the "minimal" change of the input features that changes the recommendation results to a certain level of fairness. Based on the counterfactual recommendation result of each feature, we calculate an explainability score in terms of the fairness-utility trade-off to rank all the feature-based explanations, and select the top ones as fairness explanations.
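The counterfactual idea in the abstract can be sketched in code: search for a small perturbation of item features that shrinks an exposure disparity, then score each feature by the fairness-utility trade-off of undoing its share of the perturbation. This is a minimal illustrative sketch, not the paper's actual algorithm: the toy linear recommender, the softmax exposure proxy, the finite-difference optimizer, and the hyperparameters `lam` and `beta` are all assumptions introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical stand-in for the paper's feature-aware setting):
# 5 users, 8 items, 4 item features; items split into an advantaged
# ("popular") and a disadvantaged group for exposure comparison.
U, I, F = 5, 8, 4
user_emb = rng.normal(size=(U, F))
item_feat = rng.normal(size=(I, F))
popular = np.arange(I) < I // 2  # first half = advantaged group

def exposure_gap(feats):
    """Smooth proxy for exposure disparity: difference in softmax
    exposure mass between the two item groups, averaged over users."""
    scores = user_emb @ feats.T                       # (U, I) relevance
    p = np.exp(scores) / np.exp(scores).sum(1, keepdims=True)
    return float((p[:, popular].sum(1) - p[:, ~popular].sum(1)).mean())

def utility(feats):
    """Proxy recommendation utility: mean top-1 relevance per user."""
    return float((user_emb @ feats.T).max(axis=1).mean())

def loss(delta, lam):
    """Squared disparity plus a penalty keeping the change 'minimal'."""
    return exposure_gap(item_feat + delta) ** 2 + lam * (delta ** 2).sum()

# Counterfactual search: gradient descent on the perturbation delta,
# with gradients estimated by finite differences (cheap at toy scale).
lam, lr, eps = 0.1, 0.1, 1e-4
delta = np.zeros_like(item_feat)
for _ in range(200):
    base = loss(delta, lam)
    grad = np.zeros_like(delta)
    for idx in np.ndindex(delta.shape):
        d = delta.copy()
        d[idx] += eps
        grad[idx] = (loss(d, lam) - base) / eps
    delta -= lr * grad

# Feature-level explainability score: undo feature j's share of the
# counterfactual change and measure the fairness/utility shift.
beta = 1.0
for j in range(F):
    d_j = delta.copy()
    d_j[:, j] = 0.0
    d_fair = abs(exposure_gap(item_feat + d_j)) \
        - abs(exposure_gap(item_feat + delta))
    d_util = utility(item_feat + delta) - utility(item_feat + d_j)
    print(f"feature {j}: score = {d_fair - beta * d_util:.4f}")
```

A feature whose removal from the counterfactual re-inflates the disparity more than it recovers utility receives a high score, mirroring the abstract's trade-off-based ranking of feature explanations; the real framework would replace the toy model with the trained recommender and a differentiable fairness objective.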