In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm, called multi-step primal-dual (MSPD), and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm, called distributed randomized smoothing (DRS), based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
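To make the idea of smoothing a non-smooth objective concrete, the following is a minimal single-machine sketch of randomized smoothing. It is not the DRS algorithm itself and involves no network. The Gaussian perturbation, the $\ell_1$ test function, and the names `f`, `subgrad_f`, `smoothed_grad`, and `gamma` are all assumptions chosen for illustration. The sketch estimates the gradient of the smoothed surrogate $f_\gamma(x) = \mathbb{E}[f(x + \gamma X)]$, with $X$ standard Gaussian, by averaging subgradients of $f$ at randomly perturbed points.

```python
import random


def f(x):
    """Non-smooth convex test function (assumed for illustration): the l1 norm."""
    return sum(abs(v) for v in x)


def subgrad_f(x):
    """One subgradient of the l1 norm: the sign vector (0 at a kink)."""
    return [(v > 0) - (v < 0) for v in x]


def smoothed_grad(x, gamma, num_samples, rng):
    """Monte Carlo estimate of the gradient of f_gamma(x) = E[f(x + gamma * X)],
    X ~ N(0, I), obtained by averaging subgradients at perturbed points."""
    d = len(x)
    total = [0.0] * d
    for _ in range(num_samples):
        # Perturb the query point with isotropic Gaussian noise of scale gamma.
        z = [xi + gamma * rng.gauss(0.0, 1.0) for xi in x]
        g = subgrad_f(z)
        for i in range(d):
            total[i] += g[i]
    return [t / num_samples for t in total]


rng = random.Random(0)
# The first coordinate sits exactly on the kink of |x|; the second is far from it.
grad_at_kink = smoothed_grad([0.0, 2.0], gamma=0.1, num_samples=2000, rng=rng)
```

At the kink, the averaged subgradient is close to zero, which is the value a smooth symmetric surrogate would give, while far from the kink the estimate matches the ordinary gradient. A smaller `gamma` keeps the surrogate closer to $f$ but makes it less smooth, which is the trade-off any smoothing-based method has to balance.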