Federated learning (FL) provides an effective paradigm for training machine learning models over distributed data with privacy protection. However, recent studies show that FL is subject to various security, privacy, and fairness threats due to potentially malicious and heterogeneous local agents. For instance, it is vulnerable to local adversarial agents who contribute only low-quality data, with the goal of harming the performance of agents with high-quality data. This kind of attack breaks existing definitions of fairness in FL, which mainly focus on a certain notion of performance parity. In this work, we aim to address this limitation and propose a formal definition of fairness via agent-awareness for FL (FAA), which takes the heterogeneous data contributions of local agents into account. In addition, we propose a fair FL training algorithm based on agent clustering (FOCUS) to achieve FAA. Theoretically, we prove the convergence and optimality of FOCUS under mild conditions for linear models and for general convex loss functions with bounded smoothness. We also prove that FOCUS always achieves higher fairness as measured by FAA than the standard FedAvg protocol, under both linear models and general convex loss functions. Empirically, we evaluate FOCUS on four datasets, including synthetic data, images, and texts, under different settings, and we show that FOCUS achieves significantly higher fairness based on FAA while maintaining similar or even higher prediction accuracy compared with FedAvg.
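To make the clustering idea concrete, below is a minimal toy sketch of cluster-based federated training in the spirit described above: the server keeps one model per cluster, each agent joins the cluster whose model best fits its local data, and local updates are averaged FedAvg-style within each cluster. All names, hyperparameters, and the synthetic setup are our own illustrative assumptions, not the paper's actual FOCUS implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (not from the paper): five agents drawn from two latent
# data distributions -- agents 0-2 share one ground-truth linear model,
# agents 3-4 another, so a single global model would serve neither well.
def make_agent(w_true, n=200, noise=0.1):
    X = rng.normal(size=(n, 2))
    y = X @ w_true + noise * rng.normal(size=n)
    return X, y

w_a, w_b = np.array([1.0, -2.0]), np.array([-2.0, 1.0])
agents = [make_agent(w_a) for _ in range(3)] + [make_agent(w_b) for _ in range(2)]

def local_update(w, X, y, lr=0.1, steps=5):
    """A few local gradient-descent steps on squared loss (the local work)."""
    for _ in range(steps):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

def loss(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

K = 2  # assumed number of clusters
# Warm-start each cluster model from one agent's local solution (a heuristic
# we add here to avoid degenerate all-in-one-cluster assignments).
models = [local_update(np.zeros(2), *agents[0]),
          local_update(np.zeros(2), *agents[3])]

for _ in range(50):
    # Assignment step: each agent joins the cluster whose model fits it best.
    assign = [min(range(K), key=lambda k: loss(models[k], X, y))
              for X, y in agents]
    # Aggregation step: FedAvg-style averaging of local updates per cluster.
    for k in range(K):
        members = [i for i, a in enumerate(assign) if a == k]
        if members:
            models[k] = np.mean(
                [local_update(models[k], *agents[i]) for i in members], axis=0)

print(assign)  # → [0, 0, 0, 1, 1]: clean agents and noisy-distribution agents separate
```

In this toy run the two agent groups end up in separate clusters, and each cluster model converges near its group's ground truth, so no agent's accuracy is dragged down by data drawn from the other distribution.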