We consider a decentralized learning setting in which data is distributed over the nodes of a graph. The goal is to learn a global model on the distributed data without involving any central entity that needs to be trusted. While gossip-based stochastic gradient descent (SGD) can be used to achieve this learning objective, it incurs high communication and computation costs, since it has to wait for the local models at all the nodes to converge. To speed up the convergence, we propose instead to study random walk based SGD, in which a global model is updated based on a random walk on the graph. We propose two algorithms based on two types of random walks that achieve, in a decentralized way, uniform sampling and importance sampling of the data. We provide a non-asymptotic analysis of the rate of convergence, taking into account the constants related to the data and the graph. Our numerical results show that the weighted random walk based algorithm performs better on high-variance data. Moreover, we propose a privacy-preserving random walk algorithm that achieves local differential privacy based on a Gamma noise mechanism that we propose. We also give numerical results on the convergence of this algorithm and show that it outperforms additive Laplace-based privacy mechanisms.
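To make the update rule concrete, a minimal sketch follows under assumed notation that is not part of the abstract: $x_k$ denotes the global model after $k$ updates, $f_i$ the local loss at node $i$, and $\gamma_k$ the step size; the node $i_k$ performing step $k$ is the current position of the random walk, which then hands the updated model to a neighbor chosen according to the walk's transition probabilities.
\[
x_{k+1} \;=\; x_k \;-\; \gamma_k \,\nabla f_{i_k}(x_k), \qquad i_k = \text{node visited by the random walk at step } k .
\]
Choosing the walk's transition probabilities to be uniform over neighbors (after suitable correction) corresponds to uniform sampling of the data, while biasing them by per-node weights corresponds to importance sampling.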