Federated learning is typically approached as an optimization problem, where the goal is to minimize a global loss function by distributing computation across client devices that possess local data and specify different parts of the global objective. We present an alternative perspective and formulate federated learning as a posterior inference problem, where the goal is to infer a global posterior distribution by having client devices each infer the posterior of their local data. While exact inference is often intractable, this perspective provides a principled way to search for global optima in federated settings. Further, starting from an analysis of federated quadratic objectives, we develop a computation- and communication-efficient approximate posterior inference algorithm, federated posterior averaging (FedPA). Our algorithm uses MCMC for approximate inference of local posteriors on the clients and efficiently communicates their statistics to the server, which uses them to refine a global estimate of the posterior mode. Finally, we show that FedPA generalizes federated averaging (FedAvg), can similarly benefit from adaptive optimizers, and yields state-of-the-art results on four realistic and challenging benchmarks, converging faster and to better optima.
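For concreteness, below is a minimal sketch of one FedPA-style communication round under two simplifying assumptions: each local posterior is approximated as a Gaussian, and the prior is uniform, so the global posterior factorizes into a product of local posteriors whose mode the server estimates. The helper names (`client_delta`, `server_round`, `sample_local_posterior`), the ridge constant, and the plain gradient step are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def client_delta(theta, samples, ridge=1e-3):
    """Estimate the local posterior mean and covariance from approximate
    MCMC samples, then return Delta = Sigma^{-1} (theta - mu).
    Averaged over clients, this direction points toward the mode of the
    product of the local Gaussian posterior approximations."""
    mu = samples.mean(axis=0)
    # The ridge term keeps the sample covariance invertible when the
    # number of posterior samples is small relative to the dimension
    # (an assumption of this sketch; the paper uses more careful
    # shrinkage estimators computed incrementally on the client).
    sigma = np.cov(samples, rowvar=False) + ridge * np.eye(theta.size)
    return np.linalg.solve(sigma, theta - mu)

def server_round(theta, sample_local_posterior, client_ids, lr=1.0):
    """One round: each sampled client runs local MCMC starting from the
    current global model and reports only its delta; the server treats
    the averaged delta as a pseudo-gradient and takes a step."""
    deltas = [client_delta(theta, sample_local_posterior(cid, theta))
              for cid in client_ids]
    return theta - lr * np.mean(deltas, axis=0)
```

One way to see the claim that FedPA generalizes FedAvg: if every client's covariance is fixed to the identity, the delta reduces to the plain model difference between the global model and the local solution, which is exactly FedAvg's client update. Likewise, because the server consumes the averaged delta as a pseudo-gradient, it can be handed to an adaptive server optimizer (e.g., Adam) in place of the plain gradient step used here.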