Distributed Stein Variational Gradient Descent (DSVGD) is a non-parametric distributed learning framework for federated Bayesian learning, in which multiple clients jointly train a machine learning model by exchanging a set of non-random, interacting particles with the server. Since communication resources are limited, selecting the clients with the most informative local learning updates can improve model convergence and communication efficiency. In this paper, we propose two client selection schemes for DSVGD based on Kernelized Stein Discrepancy (KSD) and Hilbert Inner Product (HIP). We derive an upper bound on the decrease of the global free energy per iteration for both schemes, which is then minimized to speed up model convergence. We evaluate and compare our schemes with conventional selection schemes in terms of model accuracy, convergence speed, and stability across various learning tasks and datasets.
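For context, the first scheme builds on the kernelized Stein discrepancy. Using notation introduced here only for illustration (a positive-definite kernel $k$, the score function $s_p(x) = \nabla_x \log p(x)$ of the target posterior $p$, and particles $x_1, \dots, x_n$ representing a client's local approximation $q$), the standard squared KSD and its empirical estimate read

$$
\mathrm{KSD}^2(q \,\|\, p) = \mathbb{E}_{x, x' \sim q}\big[\, u_p(x, x') \,\big],
$$
$$
u_p(x, x') = s_p(x)^{\top} k(x, x')\, s_p(x') + s_p(x)^{\top} \nabla_{x'} k(x, x') + \nabla_x k(x, x')^{\top} s_p(x') + \operatorname{tr}\!\big(\nabla_x \nabla_{x'} k(x, x')\big),
$$
$$
\widehat{\mathrm{KSD}}^2 = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} u_p(x_i, x_j).
$$

This is the generic definition only; the per-client selection criterion obtained from the free-energy upper bound is derived in the body of the paper.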