Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

June 5, 2023

Tongtian Zhu, Fengxiang He, Kaixuan Chen, Mingli Song, Dacheng Tao

From arXiv. Accepted for publication in the 40th International Conference on Machine Learning (ICML 2023).

Decentralized stochastic gradient descent (D-SGD) allows collaborative learning on a massive number of devices simultaneously without the control of a central server. However, existing theories claim that decentralization invariably undermines generalization. In this paper, we challenge this conventional belief and present a completely new perspective for understanding decentralized learning. We prove that D-SGD implicitly minimizes the loss function of an average-direction sharpness-aware minimization (SAM) algorithm under general non-convex non-$\beta$-smooth settings. This surprising asymptotic equivalence reveals an intrinsic regularization-optimization trade-off and three advantages of decentralization: (1) there exists a free uncertainty evaluation mechanism in D-SGD to improve posterior estimation; (2) D-SGD exhibits a gradient smoothing effect; and (3) the sharpness regularization effect of D-SGD does not decrease as total batch size increases, which justifies the potential generalization benefit of D-SGD over centralized SGD (C-SGD) in large-batch scenarios.
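
To make the setting concrete, here is a minimal sketch of one common form of D-SGD on a toy least-squares problem. It is not taken from the paper: the ring topology, the mixing weights of 1/3, the helper names (`ring_mixing_matrix`, `make_toy_data`, `d_sgd`), and all hyperparameters are assumptions chosen for illustration. Each worker keeps its own parameter copy, averages it with its neighbours through a doubly stochastic gossip matrix, and then applies a local stochastic gradient step, with no central server involved.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Doubly stochastic gossip matrix for a ring (illustrative choice):
    each worker puts weight 1/3 on itself and on each of its two neighbours."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] += 1.0 / 3.0
        W[i, (i - 1) % n] += 1.0 / 3.0
        W[i, (i + 1) % n] += 1.0 / 3.0
    return W


def make_toy_data(n_samples=512, dim=5, noise=0.01, seed=0):
    """Synthetic linear-regression data (hypothetical, for the demo only)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, dim))
    w_true = rng.normal(size=dim)
    y = X @ w_true + noise * rng.normal(size=n_samples)
    return X, y, w_true


def d_sgd(X, y, n_workers=8, steps=300, lr=0.05, batch=16, seed=1):
    """Run D-SGD and return the per-worker parameters, one row per worker."""
    rng = np.random.default_rng(seed)
    W = ring_mixing_matrix(n_workers)
    # Every worker starts from its own random initialization.
    params = rng.normal(size=(n_workers, X.shape[1]))
    for _ in range(steps):
        grads = np.empty_like(params)
        for i in range(n_workers):
            # Each worker samples its own mini-batch and evaluates the
            # least-squares gradient at its local parameters.
            idx = rng.integers(0, X.shape[0], size=batch)
            residual = X[idx] @ params[i] - y[idx]
            grads[i] = X[idx].T @ residual / batch
        # Gossip averaging with neighbours, then a local gradient step.
        params = W @ params - lr * grads
    return params


if __name__ == "__main__":
    X, y, w_true = make_toy_data()
    params = d_sgd(X, y)
    avg = params.mean(axis=0)
    print("error of averaged model:", np.linalg.norm(avg - w_true))
    print("max consensus distance:", np.max(np.linalg.norm(params - avg, axis=1)))
```

The spread of the local models around their average (the consensus distance printed above) is the kind of quantity that a perturbation-based reading of D-SGD would focus on. This toy example only illustrates the update rule; it does not demonstrate the paper's equivalence result.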
