Vertical federated learning (VFL) is a machine learning paradigm that learns models from features distributed across different platforms in a privacy-preserving way. Because real-world data may be biased with respect to fairness-sensitive features (e.g., gender), VFL models may inherit that bias from the training data and become unfair to some user groups. However, existing fair ML methods usually rely on centralized storage of fairness-sensitive features to achieve model fairness, which is usually inapplicable in federated scenarios. In this paper, we propose a fair vertical federated learning framework (FairVFL) that can improve the fairness of VFL models. The core idea of FairVFL is to learn unified and fair representations of samples from the decentralized feature fields in a privacy-preserving way. Specifically, each platform with fairness-insensitive features first learns local data representations from its local features. These local representations are then uploaded to a server and aggregated into a unified representation for the target task. To learn fair unified representations, we send them to each platform storing fairness-sensitive features and apply adversarial learning to remove the bias that the unified representations inherit from the biased data. Moreover, to protect user privacy, we further propose a contrastive adversarial learning method that removes private information from the unified representations on the server before they are sent to the platforms keeping fairness-sensitive features. Experiments on two real-world datasets validate that our method can effectively improve model fairness while keeping user privacy well protected.
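The data flow described above (local representations, server-side aggregation, and an adversarial fairness term) can be illustrated with a minimal forward-pass sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the single-layer `tanh` encoders, mean aggregation, binary cross-entropy losses, the weight `lam`, and all function names. The contrastive adversarial privacy step is not shown, and no training loop is included.

```python
import math
import random

def linear(weights, x):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y):
    """Binary cross-entropy for one sample, clipped for numerical safety."""
    p = min(max(p, 1e-7), 1.0 - 1e-7)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def init(rows, cols, rng):
    """Random weight matrix (hypothetical initialisation)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def local_representation(encoder, features):
    """Runs on a platform that holds fairness-insensitive features."""
    return [math.tanh(h) for h in linear(encoder, features)]

def aggregate(local_reps):
    """Runs on the server: element-wise mean (an assumed aggregator)."""
    n = len(local_reps)
    return [sum(vals) / n for vals in zip(*local_reps)]

def fair_objective(unified, task_head, adv_head, y_task, y_sensitive, lam=0.5):
    """Assumed objective for the encoders: task loss minus weighted adversary loss.

    The adversary (on the platform storing the sensitive feature) would
    minimise adv_loss; the encoders would minimise the returned objective,
    which rewards representations the adversary cannot exploit.
    """
    task_loss = bce(sigmoid(linear(task_head, unified)[0]), y_task)
    adv_loss = bce(sigmoid(linear(adv_head, unified)[0]), y_sensitive)
    return task_loss - lam * adv_loss, task_loss, adv_loss

rng = random.Random(0)
DIM = 4  # size of every representation (assumption)

# Two platforms with 3 and 2 fairness-insensitive features respectively.
encoders = [init(DIM, 3, rng), init(DIM, 2, rng)]
task_head = init(1, DIM, rng)
adv_head = init(1, DIM, rng)

features_per_platform = [[0.2, -1.0, 0.5], [1.0, 0.3]]
local_reps = [local_representation(e, f)
              for e, f in zip(encoders, features_per_platform)]
unified = aggregate(local_reps)
objective, task_loss, adv_loss = fair_objective(
    unified, task_head, adv_head, y_task=1, y_sensitive=0)
```

In this sketch the raw features never leave their platforms; only the local representations reach the server, and only the unified representation reaches the platform that holds the sensitive attribute.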