We consider federated learning (FL), where the training data is distributed across a large number of clients. The standard optimization method in this setting is Federated Averaging (FedAvg), which performs multiple local first-order optimization steps between communication rounds. In this work, we evaluate, in the FL setting, several second-order distributed methods with local steps that promise favorable convergence properties. We (i) show that, in contrast to the results of previous work, FedAvg performs surprisingly well against its second-order competitors when evaluated under fair metrics (an equal amount of local computation). Based on our numerical study, we (ii) propose a novel variant that uses second-order local information for updates and a global line search to counteract the resulting local specificity.
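To make the FedAvg baseline concrete, the following is a minimal sketch of the round structure described above (several local first-order steps per client, followed by server-side averaging). The least-squares objective, the synthetic client data, and all function names are illustrative assumptions and do not reflect the experimental setup of this work.

```python
# Minimal FedAvg sketch (illustrative only; objective and data are assumptions).
import numpy as np

def local_sgd(w, data, lr=0.1, local_steps=5):
    """Run several local first-order (gradient) steps on one client's data."""
    X, y = data
    w = w.copy()
    for _ in range(local_steps):
        grad = X.T @ (X @ w - y) / len(y)   # least-squares gradient
        w -= lr * grad
    return w

def fedavg(clients, w, rounds=20, local_steps=5):
    """One communication round = local steps on every client + model averaging."""
    for _ in range(rounds):
        local_models = [local_sgd(w, data, local_steps=local_steps) for data in clients]
        w = np.mean(local_models, axis=0)    # server averages the client models
    return w

# Toy usage: two clients with differently shifted least-squares data.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
clients = []
for shift in (0.0, 3.0):
    X = rng.normal(shift, 1.0, size=(50, 2))
    y = X @ w_true + 0.1 * rng.normal(size=50)
    clients.append((X, y))
print("recovered weights:", fedavg(clients, np.zeros(2)))
```

The fair-comparison metric discussed above corresponds to fixing the total number of local steps (`rounds * local_steps`) across all methods being compared.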