Where to Begin? Exploring the Impact of Pre-Training and Initialization in Federated Learning

An oft-cited challenge of federated learning is the presence of data heterogeneity: the data at different clients may follow very different distributions. Several federated optimization methods have been proposed to address this challenge. In the literature, empirical evaluations usually start federated training from a random initialization. However, in many practical applications of federated learning, the server has access to proxy data for the training task, which can be used to pre-train a model before federated training begins. We empirically study the impact of starting from a pre-trained model in federated learning using four common federated learning benchmark datasets. Unsurprisingly, starting from a pre-trained model reduces the training time required to reach a target error rate and enables training more accurate models (by up to 40%) than is possible when starting from a random initialization. Surprisingly, we also find that the effect of data heterogeneity is much less significant when starting federated training from a pre-trained initialization. Rather, when starting from a pre-trained model, using an adaptive optimizer at the server, such as FedAdam, consistently leads to the best accuracy. We recommend that future work proposing and evaluating federated optimization methods consider the performance when starting from both random and pre-trained initializations. We also believe this study raises several questions for further work on understanding the role of heterogeneity in federated optimization.
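To make the "adaptive optimizer at the server" concrete, the following is a minimal sketch of one FedAdam-style server round, written in pure Python over flat lists of floats. It is not the authors' implementation. The function name `fedadam_server_step`, the hyperparameter defaults, and the flat-list model representation are assumptions chosen for illustration. The update follows the commonly described FedAdam form: the average client change is treated as a pseudo-gradient and passed through Adam-style moment estimates, without bias correction.

```python
import math


def fedadam_server_step(weights, client_weights, m, v,
                        server_lr=0.1, beta1=0.9, beta2=0.99, tau=1e-3):
    """One FedAdam-style server update (illustrative sketch).

    weights        -- current server model, a flat list of floats
    client_weights -- list of client models after local training
    m, v           -- first and second moment estimates kept by the server
    All default hyperparameter values are assumptions, not from the paper.
    """
    n_clients = len(client_weights)

    # Pseudo-gradient: average difference between client and server models.
    delta = [
        sum(cw[j] - weights[j] for cw in client_weights) / n_clients
        for j in range(len(weights))
    ]

    # Adam-style exponential moving averages of the pseudo-gradient.
    new_m = [beta1 * mj + (1 - beta1) * dj for mj, dj in zip(m, delta)]
    new_v = [beta2 * vj + (1 - beta2) * dj * dj for vj, dj in zip(v, delta)]

    # Per-coordinate adaptive step; tau keeps the denominator away from zero.
    new_weights = [
        wj + server_lr * mj / (math.sqrt(vj) + tau)
        for wj, mj, vj in zip(weights, new_m, new_v)
    ]
    return new_weights, new_m, new_v


# Hypothetical usage: the only difference between the two settings compared
# in the abstract is where `weights` comes from (random vs. pre-trained).
weights = [0.0, 0.0]
m = [0.0, 0.0]
v = [0.0, 0.0]
clients = [[1.0, 2.0], [3.0, 2.0]]
weights, m, v = fedadam_server_step(weights, clients, m, v)
```

In this sketch, switching from a random to a pre-trained initialization changes only the starting value of `weights`; the server update itself is unchanged.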
