Federated Learning (FL) is widely accepted as a solution for privacy-preserving machine learning that does not require collecting raw data. Although techniques proposed over the past few years have advanced the FL field, the evaluation results reported in these works lack completeness and are hard to compare because of inconsistent evaluation metrics and experimental settings. In this paper, we propose FedEval, a holistic evaluation framework for FL, and present a benchmarking study of seven state-of-the-art FL algorithms. Specifically, we first introduce the core evaluation taxonomy, called FedEval-Core, which covers four essential evaluation aspects for FL: Privacy, Robustness, Effectiveness, and Efficiency, each with well-defined metrics and experimental settings. Based on FedEval-Core, we further develop an FL evaluation platform with standardized evaluation settings and easy-to-use interfaces. We then provide an in-depth benchmarking study of seven well-known FL algorithms: FedSGD, FedAvg, FedProx, FedOpt, FedSTC, SecAgg, and HEAgg. We comprehensively analyze the advantages and disadvantages of these algorithms and identify the practical scenarios to which each is suited, which prior work has rarely done. Lastly, we distill a set of take-away insights and future research directions for researchers in the FL area.