It is important to guarantee that machine learning algorithms deployed in the real world do not result in unfairness or unintended social consequences. Fair ML has largely focused on the protection of a single attribute in the simpler setting where both the attribute and the target outcome are binary. However, many real-world applications require the simultaneous protection of multiple sensitive attributes, which are often not binary but continuous or categorical. To address this more challenging task, we introduce FairCOCCO, a fairness measure built on cross-covariance operators on reproducing kernel Hilbert spaces. This leads to two practical tools: first, the FairCOCCO Score, a normalised metric that quantifies fairness in settings with single or multiple sensitive attributes of arbitrary type; and second, a regularisation term that can be incorporated into arbitrary learning objectives to obtain fair predictors. These contributions address crucial gaps in the algorithmic fairness literature, and we empirically demonstrate consistent improvements over state-of-the-art techniques in balancing predictive power and fairness on real-world datasets.
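To make the idea of a normalised, kernel-based fairness score concrete, the sketch below computes an HSIC-style dependence measure between model predictions and a sensitive attribute. This is an illustrative proxy under the assumption of RBF kernels with a median-heuristic bandwidth, not the paper's exact FairCOCCO definition; the function name kernel_dependence_score and the regularisation weight are hypothetical.

```python
import numpy as np

def rbf_kernel(x, sigma=None):
    """Pairwise RBF kernel matrix; median heuristic for the bandwidth if sigma is None."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    if sigma is None:
        med = np.median(sq_dists[sq_dists > 0])
        sigma = np.sqrt(0.5 * med) if med > 0 else 1.0
    return np.exp(-sq_dists / (2 * sigma ** 2))

def kernel_dependence_score(y_pred, s, eps=1e-3):
    """Normalised HSIC-style dependence between predictions y_pred and a
    sensitive attribute s (either may be continuous, categorical, or binary).
    Values near 0 indicate approximate independence (fairness); values toward 1
    indicate strong dependence. Illustrative only, not the FairCOCCO Score."""
    n = len(y_pred)
    H = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    K = H @ rbf_kernel(y_pred) @ H               # centred kernel on predictions
    L = H @ rbf_kernel(s) @ H                    # centred kernel on sensitive attribute
    hsic = np.trace(K @ L) / (n - 1) ** 2        # empirical HSIC estimate
    norm = np.sqrt(np.trace(K @ K) * np.trace(L @ L)) / (n - 1) ** 2
    return hsic / (norm + eps)                   # normalise to roughly [0, 1]
```

A score of this form can also serve as a penalty: adding lambda * kernel_dependence_score(y_pred, s) to a training loss (for a chosen trade-off weight lambda) mirrors how a dependence-based regularisation term can be incorporated into an arbitrary learning objective to encourage fair predictors.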