Despite the rapid development and great success of machine learning models, extensive studies have shown that they inherit latent discrimination and societal bias from their training data, which hinders their adoption in high-stakes applications. Many efforts have therefore been made to develop fair machine learning models, most of which require sensitive attributes to be available during training. In many real-world applications, however, sensitive attributes are infeasible to obtain due to privacy or legal constraints, which challenges existing fairness-ensuring strategies. Although the sensitive attribute of each data sample is unknown, we observe that the training data usually contain non-sensitive features that are highly correlated with sensitive attributes, and these can be used to alleviate bias. In this paper, we therefore study the novel problem of exploiting features that are highly correlated with sensitive attributes to learn fair and accurate classifiers. We show theoretically that by minimizing the correlation between these related features and the model's predictions, we can learn a fair classifier. Based on this insight, we propose a novel framework that simultaneously uses the related features for accurate prediction and enforces fairness. In addition, the model can dynamically adjust the regularization weight of each related feature to balance its contribution to classification accuracy and fairness. Experimental results on real-world datasets demonstrate that the proposed model learns fair models with high classification accuracy.
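The core idea of the abstract, penalizing the correlation between model predictions and non-sensitive proxy features, can be sketched in a few lines. The following is a minimal illustration, not the paper's actual framework: it assumes a synthetic dataset, a plain logistic-regression classifier, and a single fixed regularization weight `lam` (the proposed model adjusts per-feature weights dynamically). All data, variable names, and hyperparameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: x0 is a "related feature" that serves as a
# proxy for a hidden sensitive attribute s; x1 is an ordinary feature.
n = 2000
s = rng.integers(0, 2, n)                  # hidden sensitive attribute
x0 = s + 0.3 * rng.standard_normal(n)      # non-sensitive proxy for s
x1 = rng.standard_normal(n)
y = ((x1 + 0.5 * s + 0.2 * rng.standard_normal(n)) > 0.5).astype(float)
X = np.column_stack([x0, x1, np.ones(n)])  # features + bias column

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(lam, epochs=500, lr=0.5):
    """Logistic regression with a penalty on |cov(prediction, x0)|."""
    theta = np.zeros(3)
    xc = x0 - x0.mean()                    # centred related feature
    for _ in range(epochs):
        p = sigmoid(X @ theta)
        grad_ce = X.T @ (p - y) / n        # cross-entropy gradient
        cov = (xc * p).mean()              # cov(p, x0)
        # Gradient of |cov(p, x0)| w.r.t. theta (analytic, since the
        # mean-prediction term cancels against the centred feature):
        grad_cov = np.sign(cov) * (X.T @ (xc * p * (1 - p))) / n
        theta -= lr * (grad_ce + lam * grad_cov)
    return theta

for lam in (0.0, 5.0):
    theta = train(lam)
    p = sigmoid(X @ theta)
    # Demographic-parity gap: difference in mean prediction across groups.
    gap = abs(p[s == 1].mean() - p[s == 0].mean())
    acc = ((p > 0.5) == y).mean()
    print(f"lam={lam}: accuracy={acc:.3f}, parity gap={gap:.3f}")
```

Increasing `lam` suppresses the covariance between the predictions and the proxy feature, and hence the dependence of predictions on the hidden sensitive attribute, at some cost in accuracy; this is the trade-off the per-feature weighting in the proposed framework is designed to balance.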