Towards Understanding Quality Challenges of the Federated Learning for Neural Networks: A First Look from the Lens of Robustness

Federated learning (FL) is a distributed learning paradigm that preserves users' data privacy while leveraging the entire dataset of all participants. In FL, multiple models are trained independently on the clients and aggregated centrally to update a global model in an iterative process. Although this approach is excellent at preserving privacy, FL still suffers from quality issues such as attacks or Byzantine faults. Recent work has tried to address these quality challenges through robust aggregation techniques for FL. However, the effectiveness of state-of-the-art (SOTA) robust FL techniques is still unclear and has not been studied comprehensively. Therefore, to better understand the current quality status and challenges of these SOTA FL techniques in the presence of attacks and faults, we perform a large-scale empirical study of SOTA FL quality from three angles: attacks, simulated faults (via mutation operators), and aggregation (defense) methods. In particular, we study FL's performance on image classification tasks and use DNNs as our model type. We perform our study on two generic image datasets and one real-world federated medical image dataset. We also investigate how the proportion of affected clients and the dataset distribution affect the robustness of FL. After a large-scale analysis with 496 configurations, we find that most mutators applied to individual clients have a negligible effect on the final model in the generic datasets, and only one of them is effective in the medical dataset. Furthermore, we show that model poisoning attacks are more effective than data poisoning attacks. Moreover, the most robust FL aggregator depends on the attack and the dataset. Finally, we illustrate that a simple ensemble of aggregators achieves a more robust solution than any single aggregator and is the best choice in 75% of the cases.
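
The central aggregation step described above can be illustrated with a minimal sketch. The abstract does not name the aggregators that were studied, so `fed_avg` (a plain weighted mean) and `coordinate_median` (a coordinate-wise median) are assumptions chosen only for illustration, and the client updates are made-up toy values, not data from the study. The sketch shows why a single poisoned client can skew a mean-based aggregate while a median-based one stays close to the honest clients.

```python
from statistics import median


def fed_avg(updates, weights=None):
    """Weighted mean of client parameter vectors (plain federated averaging)."""
    weights = weights or [1.0] * len(updates)
    total = sum(weights)
    dim = len(updates[0])
    return [
        sum(w * u[i] for w, u in zip(weights, updates)) / total
        for i in range(dim)
    ]


def coordinate_median(updates):
    """Coordinate-wise median, one common robust alternative to the mean."""
    return [median(column) for column in zip(*updates)]


# Hypothetical toy round: four honest clients near [1.0, 2.0]
# and one client sending a poisoned update.
honest = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.0]]
poisoned = [[100.0, -100.0]]
updates = honest + poisoned

mean_result = fed_avg(updates)               # pulled far away by the outlier
median_result = coordinate_median(updates)   # stays at [1.0, 2.0]

print("mean:  ", mean_result)
print("median:", median_result)
```

In a real FL system each update would be a full set of DNN weights and the aggregation would repeat every round; the two-element vectors here only stand in for that.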
