We consider the distributed optimization problem in which $n$ agents, each possessing a local cost function, collaboratively minimize the average of the $n$ cost functions over a connected network. Assuming stochastic gradient information is available, we study a distributed stochastic gradient algorithm, called exact diffusion with adaptive stepsizes (EDAS), which is adapted from the Exact Diffusion method and NIDS, and we perform a non-asymptotic convergence analysis. We not only show that EDAS asymptotically achieves the same network-independent convergence rate as centralized stochastic gradient descent (SGD) for minimizing strongly convex and smooth objective functions, but also characterize the transient time needed for the algorithm to approach this asymptotic rate, which behaves as $K_T=\mathcal{O}\left(\frac{n}{1-\lambda_2}\right)$, where $1-\lambda_2$ denotes the spectral gap of the mixing matrix. To the best of our knowledge, EDAS achieves the shortest transient time when the average of the $n$ cost functions is strongly convex and each cost function is smooth. Numerical simulations further corroborate and strengthen the obtained theoretical results.
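For context, the following is a minimal sketch of the per-agent update underlying Exact Diffusion / NIDS in adapt-then-combine form, with a decreasing stepsize sequence $\{\alpha_k\}$ in place of a constant stepsize; the precise EDAS recursion and the stochastic gradient notation $\nabla f_i(\cdot;\xi_i^k)$ used here are illustrative assumptions rather than taken from the text:
\begin{align*}
\psi_i^{k+1} &= x_i^{k} - \alpha_k \nabla f_i\big(x_i^{k};\,\xi_i^{k}\big) && \text{(local stochastic gradient step)}\\
\phi_i^{k+1} &= \psi_i^{k+1} + x_i^{k} - \psi_i^{k} && \text{(correction term enabling exact convergence)}\\
x_i^{k+1} &= \sum_{j=1}^{n} \tilde{w}_{ij}\,\phi_j^{k+1}, \qquad \tilde{W} = \tfrac{1}{2}\left(I + W\right) && \text{(combination over the network)}
\end{align*}
where $W=[w_{ij}]$ is the mixing matrix whose second largest eigenvalue $\lambda_2$ appears in the transient-time bound above. With a constant stepsize this is the standard Exact Diffusion update; the adaptive (decaying) choice of $\alpha_k$ is what allows the iterates to match the asymptotic rate of centralized SGD.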