We consider the distributed optimization problem in which $n$ agents, each possessing a local cost function, collaboratively minimize the average of the $n$ cost functions over a connected network. Assuming stochastic gradient information is available, we study a distributed stochastic gradient algorithm, called exact diffusion with adaptive stepsizes (EDAS), which is adapted from the Exact Diffusion method and NIDS, and we perform a non-asymptotic convergence analysis. We not only show that EDAS asymptotically achieves the same network-independent convergence rate as centralized stochastic gradient descent (SGD) for minimizing strongly convex and smooth objective functions, but also characterize the transient time needed for the algorithm to approach this asymptotic rate, which behaves as $K_T=\mathcal{O}\left(\frac{n}{1-\lambda_2}\right)$, where $1-\lambda_2$ denotes the spectral gap of the mixing matrix. To the best of our knowledge, EDAS achieves the shortest transient time when the average of the $n$ cost functions is strongly convex and each cost function is smooth. Numerical simulations further corroborate and strengthen the obtained theoretical results.
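For context, the following is a minimal sketch of the per-agent update underlying Exact Diffusion / NIDS in adapt-then-combine form, with a decreasing stepsize sequence $\{\alpha_k\}$ in place of a constant stepsize; the precise EDAS recursion and the stochastic gradient notation $\nabla f_i(\cdot;\xi_i^k)$ used here are illustrative assumptions rather than taken from the text:
\begin{align*}
\psi_i^{k+1} &= x_i^{k} - \alpha_k \nabla f_i\big(x_i^{k};\,\xi_i^{k}\big) && \text{(local stochastic gradient step)}\\
\phi_i^{k+1} &= \psi_i^{k+1} + x_i^{k} - \psi_i^{k} && \text{(correction term enabling exact convergence)}\\
x_i^{k+1} &= \sum_{j=1}^{n} \tilde{w}_{ij}\,\phi_j^{k+1}, \qquad \tilde{W} = \tfrac{1}{2}\left(I + W\right) && \text{(combination over the network)}
\end{align*}
where $W=[w_{ij}]$ is the mixing matrix whose second largest eigenvalue $\lambda_2$ appears in the transient-time bound above. With a constant stepsize this is the standard Exact Diffusion update; the adaptive (decaying) choice of $\alpha_k$ is what allows the iterates to match the asymptotic rate of centralized SGD.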