We turn the definition of individual fairness on its head---rather than ascertaining the fairness of a model given a predetermined metric, we find a metric under which a given model satisfies individual fairness. This facilitates discussion of a model's fairness and addresses the difficulty of specifying a suitable metric a priori. Our contributions are twofold: first, we introduce the definition of a minimal metric and characterize the behavior of models in terms of their minimal metrics; second, for more complex models, we apply the mechanism of randomized smoothing from adversarial robustness to make them individually fair under a given weighted $L^p$ metric. Our experiments show that adapting the minimal metrics of linear models to more complex neural networks leads to meaningful and interpretable fairness guarantees at little cost to utility.
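As a minimal sketch of the smoothing idea referenced above (not the paper's implementation): a classifier can be smoothed by majority-voting its predictions over anisotropic Gaussian noise, where per-feature noise scales play the role of the weights in a weighted $L^2$ metric --- features allowed to vary more are treated as less fairness-relevant. The function name `smoothed_predict` and the toy linear model below are illustrative assumptions.

```python
import numpy as np

def smoothed_predict(model, x, scales, n_samples=1000, rng=None):
    """Monte Carlo estimate of the smoothed classifier
    g(x) = argmax_c P(model(x + eps) = c), with eps ~ N(0, diag(scales^2)).

    The per-feature `scales` encode a weighted L2 metric: a larger scale
    for a feature means the smoothed model is forced to be more invariant
    along that feature (illustrative sketch, not the paper's method).
    """
    rng = np.random.default_rng(rng)
    # Draw anisotropic Gaussian noise, one row per Monte Carlo sample.
    noise = rng.normal(0.0, scales, size=(n_samples, x.size))
    # Evaluate the base model on each perturbed input.
    preds = np.array([model(x + e) for e in noise])
    # Return the majority-vote class.
    classes, counts = np.unique(preds, return_counts=True)
    return classes[np.argmax(counts)]

# Toy usage: a linear threshold model on two features.
model = lambda z: int(z[0] - 0.5 * z[1] > 0)
x = np.array([5.0, 0.0])
scales = np.array([1.0, 1.0])
label = smoothed_predict(model, x, scales, n_samples=200, rng=0)
```

Points far from the decision boundary relative to `scales` keep their label with high probability, which is what makes a certified-radius argument possible for the smoothed model.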