Federated learning (FL) is a fast-developing technique that allows multiple workers to collaboratively train a global model on a distributed dataset. Conventional FL (FedAvg) employs the gradient descent algorithm, which may not be sufficiently efficient. Momentum can improve this by adding a momentum step that accelerates convergence, and it has demonstrated benefits in both centralized and FL settings. Nesterov Accelerated Gradient (NAG) is well known to be a more advantageous form of momentum, but so far it has been unclear how to quantify the benefits of NAG in FL. This motivates us to propose FedNAG, which employs NAG in each worker and aggregates both the NAG momentum and the model in the aggregator. We provide a detailed convergence analysis of FedNAG and compare it with FedAvg. Extensive experiments based on real-world datasets and trace-driven simulation show that FedNAG increases learning accuracy by 3-24% and decreases total training time by 11-70% compared with the benchmarks under a wide range of settings.
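To make the scheme described in the abstract concrete, below is a minimal sketch (not the authors' code) of the FedNAG idea: each worker runs Nesterov Accelerated Gradient steps locally, and the aggregator averages both the model parameters and the momentum vectors. The quadratic toy loss, learning rate, momentum factor, and number of local steps are illustrative assumptions, not values from the paper.

```python
import numpy as np

def local_nag_steps(w, v, grad_fn, lr=0.01, gamma=0.9, steps=5):
    """Run `steps` NAG updates on one worker's local data (illustrative form)."""
    for _ in range(steps):
        # Look-ahead gradient evaluated at the momentum-extrapolated point.
        g = grad_fn(w - gamma * v)
        v = gamma * v + lr * g
        w = w - v
    return w, v

def fednag_round(w_global, v_global, worker_grad_fns, **kwargs):
    """One FedNAG-style round: local NAG on every worker, then aggregate
    both the models and the momentum vectors (simple average here)."""
    ws, vs = [], []
    for grad_fn in worker_grad_fns:
        w_i, v_i = local_nag_steps(w_global.copy(), v_global.copy(), grad_fn, **kwargs)
        ws.append(w_i)
        vs.append(v_i)
    return np.mean(ws, axis=0), np.mean(vs, axis=0)

# Toy usage: two workers, each with a quadratic loss centered at a different point.
targets = [np.array([1.0, 2.0]), np.array([3.0, -1.0])]
grad_fns = [lambda w, t=t: w - t for t in targets]  # gradient of 0.5*||w - t||^2
w, v = np.zeros(2), np.zeros(2)
for _ in range(50):
    w, v = fednag_round(w, v, grad_fns)
print(w)  # moves toward the mean of the targets, [2.0, 0.5]
```

The key difference from FedAvg is that the momentum state `v` is aggregated and redistributed along with the model, so local NAG steps in the next round resume from a shared momentum rather than from zero.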