We consider decentralized optimization problems in which a number of agents collaborate to minimize the average of their local functions by exchanging information over an underlying communication graph. Specifically, we place ourselves in an asynchronous model where only a random portion of nodes performs computation at each iteration, while information exchange can take place between all nodes and in an asymmetric fashion. For this setting, we propose an algorithm that combines gradient tracking with variance reduction at the network level (in contrast to variance reduction within each node), which enables each node to track the average of the gradients of the objective functions. Our theoretical analysis shows that, under mild connectivity conditions on the expected mixing matrices, the algorithm converges linearly when the local objective functions are strongly convex. In particular, our result does not require the mixing matrices to be doubly stochastic. In our experiments, we investigate a broadcast mechanism that transmits information from computing nodes to their neighbors, and we confirm the linear convergence of our method on both synthetic and real-world datasets.
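To make the setting concrete, with $n$ agents and $f_i$ denoting the local function of agent $i$ (notation introduced here for illustration), the problem described above takes the standard form
\[
\min_{x \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} f_i(x).
\]
For context, a minimal sketch of the classical synchronous gradient-tracking recursion that such methods build on, written for a mixing matrix $W = (W_{ij})$ supported on the communication graph and a step size $\gamma > 0$ (this is the standard update, not the asynchronous algorithm proposed here), reads
\[
x_i^{t+1} = \sum_{j=1}^{n} W_{ij}\, x_j^{t} - \gamma\, y_i^{t},
\qquad
y_i^{t+1} = \sum_{j=1}^{n} W_{ij}\, y_j^{t} + \nabla f_i(x_i^{t+1}) - \nabla f_i(x_i^{t}),
\]
initialized with $y_i^0 = \nabla f_i(x_i^0)$. When $W$ is doubly stochastic, this initialization yields the invariant $\frac{1}{n}\sum_i y_i^t = \frac{1}{n}\sum_i \nabla f_i(x_i^t)$ at every iteration, which is the sense in which each node tracks the average gradient. The algorithm proposed here replaces this synchronous scheme with an asynchronous, network-level variance-reduced variant whose analysis does not require $W$ to be doubly stochastic.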