Fairness is an essential factor for machine learning systems deployed in high-stakes applications. Among fairness notions, individual fairness, which follows the consensus that `similar individuals should be treated similarly,' is vital for guaranteeing fair treatment of individual cases. Previous methods typically characterize individual fairness as a prediction-invariance requirement under perturbations of sensitive attributes, and solve it by adopting the Distributionally Robust Optimization (DRO) paradigm. However, adversarial perturbations along a direction covering sensitive information ignore inherent feature correlations and innate data constraints, and thus mislead the model into optimizing over off-manifold, unrealistic samples. In light of this, we propose a method to learn and generate antidote data that approximately follow the data distribution to remedy individual unfairness. These on-manifold antidote data can be combined with the original training data through a generic optimization procedure, yielding a pure pre-processing approach to individual unfairness, or can be integrated into the in-processing DRO paradigm. Through extensive experiments, we demonstrate that our antidote data mitigate individual unfairness at minimal or zero cost to the model's predictive utility.
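To make the pre-processing route above concrete, the following is a minimal, hypothetical sketch in Python. It is not the paper's implementation: the learned generative model that produces on-manifold antidote data is replaced by a simplistic nearest-neighbor blend toward real samples with the flipped sensitive attribute, the data are synthetic, and the classifier (scikit-learn's LogisticRegression), the helper name generate_antidote, and all parameters are illustrative assumptions.

```python
# Hypothetical sketch of the pre-processing route: augment training data with
# "antidote" counterparts that flip the sensitive attribute while staying
# close to the observed data distribution, then train a standard classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Toy data: X holds non-sensitive features, s is a binary sensitive attribute,
# y is the label. All synthetic, for demonstration only.
n, d = 1000, 5
X = rng.normal(size=(n, d))
s = rng.integers(0, 2, size=n)
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)

def generate_antidote(X, s, y):
    """For each sample, produce a counterpart whose sensitive attribute is
    flipped but whose features stay near the data manifold, approximated here
    by blending with the nearest real sample from the flipped group."""
    X_anti, s_anti, y_anti = [], [], []
    for group in (0, 1):
        src = X[s == group]           # samples to receive counterparts
        tgt = X[s == 1 - group]       # real samples from the flipped group
        nn = NearestNeighbors(n_neighbors=1).fit(tgt)
        _, idx = nn.kneighbors(src)
        # Blending keeps the counterpart inside the observed feature range.
        X_anti.append(0.5 * src + 0.5 * tgt[idx[:, 0]])
        s_anti.append(np.full(len(src), 1 - group))
        y_anti.append(y[s == group])  # antidote keeps the original label
    return np.vstack(X_anti), np.concatenate(s_anti), np.concatenate(y_anti)

X_a, s_a, y_a = generate_antidote(X, s, y)

# Pre-processing: train an ordinary classifier on the union of the original
# data and the antidote data (sensitive attribute appended as a feature).
X_full = np.vstack([np.column_stack([X, s]), np.column_stack([X_a, s_a])])
y_full = np.concatenate([y, y_a])
clf = LogisticRegression(max_iter=1000).fit(X_full, y_full)

# Individual-fairness check: predictions should barely change when only the
# sensitive attribute of a point is flipped.
X_test = np.column_stack([X, s])
X_flip = np.column_stack([X, 1 - s])
consistency = np.mean(clf.predict(X_test) == clf.predict(X_flip))
print(f"prediction consistency under sensitive-attribute flip: {consistency:.3f}")
```

The in-processing alternative mentioned above would instead feed such on-manifold samples into a DRO-style training loop, optimizing the worst case over the augmented neighborhood of each individual rather than simply enlarging the training set.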