Federated learning (FL) aims to train machine learning models in decentralized systems consisting of an enormous number of smart edge devices. Federated averaging (FedAvg), the fundamental algorithm in FL settings, proposes on-device training and model aggregation to avoid the heavy communication costs and privacy concerns of transmitting raw data. However, through theoretical analysis we argue that 1) the multiple steps of local updating result in gradient biases, and 2) under the training paradigm of FedAvg there is an inconsistency between the expected target distribution and the optimization objective. To tackle these problems, we first propose an unbiased gradient aggregation algorithm with keep-trace gradient descent and a gradient evaluation strategy. We then introduce an additional controllable meta updating procedure that uses a small set of data samples, indicating the expected target distribution, to provide a clear and consistent optimization objective. Both improvements are model- and task-agnostic and can be applied individually or together. Experimental results demonstrate that the proposed methods converge faster and achieve higher accuracy with different network architectures in various FL settings.
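For concreteness, below is a minimal sketch of one FedAvg communication round, the baseline training paradigm the abstract analyzes, using plain NumPy on a toy least-squares problem. This is not the paper's proposed algorithm; the function names (`local_sgd`, `fedavg_round`) and the toy data are illustrative assumptions.

```python
# Minimal FedAvg sketch: each client runs several local SGD steps on its own
# data, then the server forms a size-weighted average of the client models.
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.1, steps=5):
    """Several local gradient steps on one client's data (squared loss).
    The multiple local steps are exactly the source of the gradient bias
    the abstract refers to."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fedavg_round(w_global, clients):
    """One round: broadcast, local training, size-weighted model averaging."""
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_sgd(w_global.copy(), X, y))
        sizes.append(len(y))
    sizes = np.asarray(sizes, dtype=float)
    return np.average(updates, axis=0, weights=sizes / sizes.sum())

# Toy setup: 3 clients drawing from the same linear model w* = [1, -2].
w_true = np.array([1.0, -2.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(20, 2))
    clients.append((X, X @ w_true + 0.01 * rng.normal(size=20)))

w = np.zeros(2)
for _ in range(10):
    w = fedavg_round(w, clients)
print(w)  # approaches w_true after a few rounds
```

As the abstract describes, the proposed improvements act on this loop: keep-trace gradient descent and gradient evaluation change how local updates are formed and aggregated to remove the bias, and the meta updating step adds a server-side update on a small sample set reflecting the expected target distribution.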