The existing analysis of asynchronous stochastic gradient descent (SGD) degrades dramatically when any delay is large, giving the impression that performance depends primarily on the delay. On the contrary, we prove much better guarantees for the same asynchronous SGD algorithm regardless of the delays in the gradients, depending instead just on the number of parallel devices used to implement the algorithm. Our guarantees are strictly better than the existing analyses, and we also argue that asynchronous SGD outperforms synchronous minibatch SGD in the settings we consider. For our analysis, we introduce a novel recursion based on "virtual iterates" and delay-adaptive stepsizes, which allow us to derive state-of-the-art guarantees for both convex and non-convex objectives.
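To make the setting concrete, below is a minimal simulation sketch of asynchronous SGD on a parameter server with a delay-adaptive stepsize. The specific stepsize rule (base_lr / max(1, delay)), the uniform worker-sampling model, and the noisy quadratic test objective are illustrative assumptions for this sketch, not the paper's exact algorithm, schedule, or analysis.

```python
import numpy as np

def async_sgd(grad, x0, num_workers=4, num_steps=200, base_lr=0.1, rng=None):
    """Simulate asynchronous SGD with a delay-adaptive stepsize.

    Each worker holds the model copy it last read. At every step one worker
    finishes, its (possibly stale) stochastic gradient is applied at the
    server, and the worker re-reads the current model. The stepsize is shrunk
    in proportion to the observed delay (an illustrative delay-adaptive rule).
    """
    rng = rng or np.random.default_rng(0)
    x = np.array(x0, dtype=float)
    # Every worker starts by reading the initial model at iteration 0.
    worker_model = [x.copy() for _ in range(num_workers)]
    worker_read_iter = [0] * num_workers

    for t in range(num_steps):
        w = rng.integers(num_workers)        # worker that finishes next
        delay = t - worker_read_iter[w]      # staleness of its gradient
        g = grad(worker_model[w], rng)       # stochastic gradient at the stale point
        lr = base_lr / max(1, delay)         # delay-adaptive stepsize (illustrative)
        x -= lr * g                          # server applies the update
        worker_model[w] = x.copy()           # worker re-reads the current model
        worker_read_iter[w] = t + 1
    return x

# Example usage: noisy quadratic objective f(x) = 0.5 * ||x||^2.
noisy_quad_grad = lambda x, rng: x + 0.1 * rng.standard_normal(x.shape)
print(async_sgd(noisy_quad_grad, x0=np.ones(5)))
```

Note that in this sketch the number of parallel devices (num_workers) bounds how many gradients can be in flight at once, which is the quantity the guarantees described above depend on, rather than the size of any individual delay.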