Vertical federated learning (VFL) has attracted growing interest because it enables multiple parties possessing non-overlapping features to strengthen their machine learning models without disclosing their private data or model parameters. Like other machine learning algorithms, VFL faces the challenge of fairness: the learned model may discriminate unfairly against groups defined by sensitive attributes. To tackle this problem, we propose a fair VFL framework in this work. First, we systematically formulate the problem of training fair models in VFL, where the learning task is modelled as a constrained optimization problem. To solve it in a federated and privacy-preserving manner, we consider the equivalent dual form of the problem and develop an asynchronous gradient coordinate-descent ascent algorithm, in which some active data parties perform multiple parallelized local updates per communication round to effectively reduce the number of communication rounds. The messages that the server sends to passive parties are deliberately designed so that the information necessary for local updates is released without intruding on the privacy of data and sensitive attributes. We rigorously study the convergence of the algorithm when applied to general nonconvex-concave min-max problems. We prove that the algorithm finds a $\delta$-stationary point of the dual objective in $\mathcal{O}(\delta^{-4})$ communication rounds under mild conditions. Finally, extensive experiments on three benchmark datasets demonstrate the superior performance of our method in training fair models.
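To make the constrained-optimization formulation concrete, the following is a minimal, centralized sketch (not the paper's asynchronous VFL algorithm): a model is trained under a fairness constraint $g(w) \le \tau$ by moving to the Lagrangian dual and running simple gradient descent-ascent, descending on the model weights and ascending on the dual multiplier. The logistic loss, the demographic-parity-style gap used as the fairness constraint, the synthetic data, and all step sizes are illustrative assumptions.

```python
import numpy as np

# Synthetic data where the label is strongly tied to the sensitive attribute,
# so an unconstrained model is unfair by construction (all choices assumed).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
s = X[:, 1] > 0                              # binary sensitive attribute
y = (X[:, 1] + 0.3 * rng.normal(size=200) > 0).astype(float)

def loss_grad(w):
    """Gradient of the average logistic loss."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

def fairness(w):
    """Relaxed demographic-parity gap between the two groups' mean scores."""
    return np.abs((X[s] @ w).mean() - (X[~s] @ w).mean())

def fairness_grad(w):
    gap = (X[s] @ w).mean() - (X[~s] @ w).mean()
    return np.sign(gap) * (X[s].mean(axis=0) - X[~s].mean(axis=0))

tau = 0.05                                   # fairness slack: fairness(w) <= tau
w, lam = np.zeros(5), 0.0                    # primal weights, dual multiplier
for _ in range(500):
    w -= 0.1 * (loss_grad(w) + lam * fairness_grad(w))  # primal descent
    lam = max(0.0, lam + 0.1 * (fairness(w) - tau))     # dual ascent

# Unconstrained baseline for comparison (dual multiplier fixed at zero).
w_unc = np.zeros(5)
for _ in range(500):
    w_unc -= 0.1 * loss_grad(w_unc)
```

In this toy setup the dual variable `lam` grows whenever the fairness gap exceeds the slack `tau`, penalizing the weights that create the gap, so the constrained model ends with a much smaller gap than the unconstrained baseline. The paper's contribution is to carry out this kind of descent-ascent across feature-partitioned parties asynchronously and privately, which this single-machine sketch does not attempt to show.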