Stochastic gradient MCMC methods, such as stochastic gradient Langevin dynamics (SGLD), enable large-scale posterior inference by leveraging noisy but cheap gradient estimates. However, when federated data are non-IID, the variance of distributed gradient estimates is amplified relative to the centralized setting, and delayed communication rounds cause chains to diverge from the target posterior. In this work, we introduce the concept of conducive gradients: zero-mean stochastic gradients that serve as a mechanism for sharing probabilistic information between data shards. We propose a novel stochastic gradient estimator that incorporates the conducive gradients, and we show that it improves convergence on federated data compared to distributed SGLD (DSGLD). We evaluate conducive gradient DSGLD (CG-DSGLD) on metric learning and deep MLP tasks. Experiments show that it outperforms standard DSGLD on non-IID federated data.
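For reference, the plain SGLD update that this work builds on can be sketched as below. This is a minimal single-machine illustration, not the CG-DSGLD estimator proposed here: the toy model (unit-variance Gaussian likelihood with a Gaussian prior on the mean), the function name `sgld_gaussian_mean`, and all hyperparameter values are assumptions chosen only to keep the example self-contained.

```python
import numpy as np

def sgld_gaussian_mean(data, n_steps=5000, batch_size=32,
                       step_size=1e-4, prior_var=10.0, seed=0):
    """Sample the posterior of theta for x_i ~ N(theta, 1), theta ~ N(0, prior_var).

    Hypothetical toy example of standard SGLD with a constant step size.
    """
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    theta = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        # Draw a minibatch (with replacement) to get a cheap, noisy gradient.
        batch = data[rng.integers(0, n, size=batch_size)]
        # Gradient of the log prior.
        grad_prior = -theta / prior_var
        # Unbiased estimate of the full-data log-likelihood gradient:
        # rescale the minibatch sum by n / batch_size.
        grad_lik = (n / batch_size) * np.sum(batch - theta)
        # Langevin step: half-step along the gradient plus N(0, step_size) noise.
        theta = (theta
                 + 0.5 * step_size * (grad_prior + grad_lik)
                 + np.sqrt(step_size) * rng.normal())
        samples[t] = theta
    return samples
```

With a constant step size the chain carries some discretization and minibatch bias, so the samples only approximate the posterior. In the federated setting described above, each minibatch would come from a single non-IID shard, which is the source of the extra gradient variance the conducive gradients are meant to counteract.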