This work introduces DADAO: the first decentralized, accelerated, asynchronous, primal, first-order algorithm to minimize a sum of $L$-smooth and $\mu$-strongly convex functions distributed over a given network of size $n$. Our key insight is to model the local gradient updates and the gossip communication procedures with separate, independent Poisson Point Processes. This decouples the computation and communication steps, which can then be run in parallel, while making the whole approach completely asynchronous and yielding communication acceleration compared to synchronous approaches. Our new method employs primal gradients and does not use a multi-consensus inner loop nor other ad-hoc mechanisms such as Error Feedback, Gradient Tracking, or a Proximal operator. By relating the inverse of the smallest positive eigenvalue of the Laplacian matrix $\chi_1$ and the maximal resistance $\chi_2\leq \chi_1$ of the graph to a sufficient minimal communication rate between the nodes of the network, we show that our algorithm requires $\mathcal{O}(n\sqrt{\frac{L}{\mu}}\log(\frac{1}{\epsilon}))$ local gradients and only $\mathcal{O}(n\sqrt{\chi_1\chi_2}\sqrt{\frac{L}{\mu}}\log(\frac{1}{\epsilon}))$ communications to reach a precision $\epsilon$, up to logarithmic terms. We thus simultaneously obtain an accelerated rate for both computations and communications, improving over state-of-the-art works, and our simulations further validate the strength of our relatively unconstrained method. We also propose an SDP relaxation to find, for a given graph, the optimal gossip rate of each edge that minimizes the total number of communications, resulting in faster convergence than standard approaches relying on uniform communication weights. Our source code is released on a public repository.
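To illustrate the key modeling idea, the following sketch simulates two independent Poisson Point Processes, one driving local gradient updates and one driving gossip communications, and merges them into a single asynchronous event schedule. The rates, horizon, and seed are illustrative assumptions, not values from the paper.

```python
import random

def poisson_events(rate, horizon, rng):
    """Sample event times of a homogeneous Poisson point process with
    intensity `rate` on [0, horizon], using exponential inter-arrival gaps."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return events
        events.append(t)

rng = random.Random(0)

# Two independent processes: one for local gradient steps, one for
# gossip communications (illustrative rates; in DADAO the communication
# rate is tied to the graph quantities chi_1 and chi_2).
grad_events = [(t, "grad") for t in poisson_events(rate=1.0, horizon=10.0, rng=rng)]
gossip_events = [(t, "gossip") for t in poisson_events(rate=2.0, horizon=10.0, rng=rng)]

# Merging the two streams gives the global asynchronous schedule:
# computation and communication events interleave with no synchronization.
schedule = sorted(grad_events + gossip_events)
```

Because the two processes are independent, a node can apply a gradient step while a communication is in flight on an adjacent edge, which is what removes the synchronization barrier of round-based gossip schemes.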