Motivated by decentralized approaches to machine learning, we propose a collaborative Bayesian learning algorithm taking the form of decentralized Langevin dynamics in a non-convex setting. Our analysis shows that the initial KL divergence between the Markov chain and the target posterior distribution decreases exponentially, while the error contribution to the overall KL divergence from the additive noise decreases at a polynomial rate. We further show that the polynomial term experiences a speed-up with the number of agents and provide sufficient conditions on the time-varying step sizes to guarantee convergence to the desired distribution. The performance of the proposed algorithm is evaluated on a wide variety of machine learning tasks. The empirical results show that the performance of individual agents with locally available data is on par with the centralized setting, with a considerable improvement in the convergence rate.
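To make the algorithmic idea concrete, the following is a minimal sketch of one decentralized Langevin dynamics update, assuming a doubly stochastic mixing matrix, per-agent local log-posterior gradients, and a decaying step-size schedule; all names and numerical choices here are illustrative assumptions, not the paper's exact construction.

```python
# Sketch of a decentralized Langevin dynamics step (illustrative only):
# each agent averages with its neighbors, takes a local gradient step on its
# log-posterior, and injects Gaussian noise scaled by the step size.
import numpy as np

def decentralized_langevin_step(x, W, grad_log_post, step, rng):
    """One synchronous update for all agents.

    x             : (n_agents, dim) current iterates, one row per agent
    W             : (n_agents, n_agents) doubly stochastic mixing matrix
    grad_log_post : list of callables; agent i's gradient of its local log-posterior
    step          : current (time-varying) step size
    rng           : numpy random generator
    """
    mixed = W @ x                                    # consensus/gossip averaging
    grads = np.stack([g(x[i]) for i, g in enumerate(grad_log_post)])
    noise = rng.standard_normal(x.shape)             # additive Gaussian (Langevin) noise
    return mixed + step * grads + np.sqrt(2.0 * step) * noise

# Toy usage: two agents with 1-D Gaussian local posteriors centered at -1 and +1.
rng = np.random.default_rng(0)
W = np.array([[0.5, 0.5], [0.5, 0.5]])               # fully connected, doubly stochastic
grads = [lambda x: -(x + 1.0), lambda x: -(x - 1.0)]
x = rng.standard_normal((2, 1))
for t in range(1, 1001):
    x = decentralized_langevin_step(x, W, grads, step=0.1 / t**0.55, rng=rng)
```

The decaying step-size schedule above is only a placeholder for the time-varying step sizes whose sufficient conditions for convergence are established in the paper.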