Federated learning (FL) provides an effective collaborative training paradigm, allowing local agents to jointly train a global model without sharing their local data, thereby protecting privacy. However, because local data are heterogeneous, it is challenging to optimize, or even define, the fairness of the trained global model for the agents. For instance, existing work usually treats accuracy equity across agents as fairness in FL. This notion is limited, especially in the heterogeneous setting, since it is intuitively "unfair" to force agents with high-quality data to achieve accuracy similar to that of agents who contribute low-quality data. In this work, we aim to address such limitations and propose a formal fairness definition in FL, fairness via agent-awareness (FAA), which takes the different contributions of heterogeneous agents into account. Under FAA, the performance of agents with high-quality data will not be sacrificed merely because a large number of agents with low-quality data are present. In addition, we propose a fair FL training algorithm based on agent clustering (FOCUS) to achieve fairness in FL as measured by FAA. Theoretically, we prove the convergence and optimality of FOCUS under mild conditions for linear and general convex loss functions with bounded smoothness. We also prove that FOCUS always achieves higher fairness in terms of FAA than standard FedAvg under both linear and general convex loss functions. Empirically, we evaluate FOCUS on four datasets, including synthetic, image, and text data under different settings, and we show that FOCUS achieves significantly higher fairness in terms of FAA while maintaining similar or even higher prediction accuracy compared with FedAvg and other existing fair FL algorithms.
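The intuition that a single averaged model can sacrifice a minority of agents can be illustrated with a toy sketch. This is not the FOCUS algorithm or the FAA metric, whose definitions are not given in this abstract. It shows only the sample-size-weighted averaging step of FedAvg, applied once globally and once per cluster. The one-dimensional "local optima", the sample counts, and the hard cluster assignments (taken as already known) are all hypothetical values chosen for illustration.

```python
import numpy as np

def fedavg(params, n_samples):
    """Sample-size-weighted average of per-agent parameter vectors
    (the aggregation step of FedAvg)."""
    params = np.asarray(params, dtype=float)
    weights = np.asarray(n_samples, dtype=float)
    return (weights[:, None] * params).sum(axis=0) / weights.sum()

# Hypothetical setup: each agent's locally optimal parameter (1-D model).
# One agent prefers 1.0, four agents prefer -1.0.
local_optima = np.array([[1.0], [-1.0], [-1.0], [-1.0], [-1.0]])
n_samples = np.array([100, 100, 100, 100, 100])

# Single global model: pulled toward the majority group.
global_model = fedavg(local_optima, n_samples)
gap_single = np.abs(local_optima - global_model).ravel()

# One model per cluster; the assignments are assumed known here (hypothetical).
clusters = np.array([0, 1, 1, 1, 1])
cluster_models = np.stack([
    fedavg(local_optima[clusters == k], n_samples[clusters == k])
    for k in range(2)
])
gap_clustered = np.abs(local_optima - cluster_models[clusters]).ravel()

print("global model:", global_model)
print("distance to own optimum, single model:", gap_single)
print("distance to own optimum, per-cluster models:", gap_clustered)
```

In this toy setting the single averaged model sits far from the minority agent's optimum, whereas per-cluster averaging leaves every agent at its own optimum. Real data would not separate this cleanly, and the cluster assignments would have to be learned rather than assumed.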