Federated Learning (FL) is a distributed machine learning protocol that allows a set of agents to collaboratively train a model without sharing their datasets. This makes FL particularly suitable for settings where data privacy is desired. However, it has been observed that the performance of FL is closely related to the similarity of the agents' local data distributions: as those distributions diverge, the accuracy of the trained models drops. In this work, we examine how variations in local data distributions affect the fairness and robustness properties of the trained models, in addition to their accuracy. Our experimental results indicate that the trained models exhibit higher bias and become more susceptible to attacks as local data distributions differ. Importantly, the degradation in fairness and robustness can be much more severe than the degradation in accuracy. We therefore reveal that small variations with little impact on accuracy could still matter if the trained model is to be deployed in a fairness- or security-critical context.
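To make the setting concrete, below is a minimal FedAvg-style sketch (not the paper's exact setup) in which each agent trains locally on label-skewed data and a server averages the resulting weights; the helper names (`make_client_data`, `local_step`, `fedavg_round`) and the toy logistic-regression task are illustrative assumptions, chosen only to show how per-client distribution skew enters the protocol.

```python
# Illustrative FedAvg sketch: clients with differently skewed label
# distributions train locally; the server averages their weights.
import numpy as np

rng = np.random.default_rng(0)

def make_client_data(n, skew):
    """Toy binary data; `skew` controls how unbalanced the client's labels are."""
    y = (rng.random(n) < skew).astype(float)       # label distribution differs per client
    X = rng.normal(loc=y[:, None], scale=1.0, size=(n, 2))
    return X, y

def local_step(w, X, y, lr=0.1, epochs=5):
    """Plain gradient descent on logistic loss, starting from the global model w."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # sigmoid predictions
        w = w - lr * X.T @ (p - y) / len(y)        # log-loss gradient step
    return w

def fedavg_round(w, clients):
    """One communication round: every client trains locally, server averages."""
    updates = [local_step(w.copy(), X, y) for X, y in clients]
    return np.mean(updates, axis=0)

# Two regimes: near-IID clients vs. heavily label-skewed clients.
for skews in ([0.5, 0.5, 0.5], [0.1, 0.5, 0.9]):
    clients = [make_client_data(200, s) for s in skews]
    w = np.zeros(2)
    for _ in range(20):
        w = fedavg_round(w, clients)
    print(f"skews={skews} -> global model {w}")
```

Varying the `skews` list is one simple way to emulate the distribution shift the abstract refers to: the datasets never leave the clients, yet the averaged global model is pulled in different directions as the local label distributions drift apart.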