Federated learning (FL) is a widely adopted distributed learning paradigm in practice, which aims to preserve users' data privacy while leveraging the entire dataset of all participants for training. In FL, multiple models are trained independently on the users' devices and aggregated centrally to update a global model in an iterative process. Although this approach is excellent at preserving privacy by design, FL still tends to suffer from quality issues such as attacks or Byzantine faults. Some recent attempts have been made to address such quality challenges via robust aggregation techniques for FL. However, the effectiveness of state-of-the-art (SOTA) robust FL techniques remains unclear and lacks a comprehensive study. Therefore, to better understand the current quality status and challenges of these SOTA FL techniques in the presence of attacks and faults, in this paper, we perform a large-scale empirical study to investigate the quality of SOTA FL from multiple angles: attacks, simulated faults (via mutation operators), and aggregation (defense) methods. In particular, we perform our study on two generic image datasets and one real-world federated medical image dataset. We also systematically investigate, per dataset, the effect on the robustness results of how attacks/faults are distributed over users and of the independent and identically distributed (IID) data factor. After a large-scale analysis with 496 configurations, we find that most mutators applied to any individual user have a negligible effect on the final model. Moreover, the choice of the most robust FL aggregator depends on the attacks and datasets. Finally, we show that it is possible to achieve a generic solution that works almost as well as, or even better than, any single aggregator on all attacks and configurations, using a simple ensemble model of aggregators.
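The abstract does not specify how the ensemble of aggregators is built; as an illustration only, the sketch below assumes one simple way to combine base robust aggregators (mean, coordinate-wise median, and trimmed mean are standard choices, not necessarily the paper's): each base aggregator produces a candidate global update from the client updates, and the ensemble takes the coordinate-wise median of those candidates, so a single aggregator fooled by an attack is outvoted by the others.

```python
import numpy as np

# Hypothetical sketch, NOT the paper's exact method: combine several base
# robust aggregators by taking the coordinate-wise median of their outputs.

def mean_agg(updates):
    # Plain FedAvg-style mean; vulnerable to Byzantine outliers.
    return np.mean(updates, axis=0)

def median_agg(updates):
    # Coordinate-wise median; robust to a minority of outliers.
    return np.median(updates, axis=0)

def trimmed_mean_agg(updates, trim=0.2):
    # Drop the smallest/largest `trim` fraction per coordinate, then average.
    k = int(len(updates) * trim)
    s = np.sort(updates, axis=0)
    return np.mean(s[k:len(updates) - k], axis=0)

def ensemble_agg(updates, aggregators=(mean_agg, median_agg, trimmed_mean_agg)):
    # Each base aggregator yields one candidate global update; the ensemble
    # returns the coordinate-wise median across the candidates.
    candidates = np.stack([agg(updates) for agg in aggregators])
    return np.median(candidates, axis=0)

# Example: 5 honest client updates near (1.0, 1.0) plus 1 Byzantine outlier.
updates = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1],
                    [1.0, 1.0], [1.05, 0.95], [100.0, -100.0]])
print(ensemble_agg(updates))  # stays close to (1.0, 1.0) despite the outlier
```

Here the mean aggregator alone is dragged far off by the outlier, but the ensemble's median over the three candidates discards it, matching the abstract's observation that a simple ensemble can track the best single aggregator per configuration.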