Gradient compression is a popular technique for improving the communication complexity of stochastic first-order methods in distributed training of machine learning models. However, existing works consider only with-replacement sampling of stochastic gradients. In contrast, it is well known in practice, and was recently confirmed in theory, that stochastic methods based on without-replacement sampling, e.g., the Random Reshuffling (RR) method, perform better than methods that sample the gradients with replacement. In this work, we close this gap in the literature and provide the first analysis of methods that combine gradient compression with without-replacement sampling. We first develop a na\"ive combination of random reshuffling with gradient compression (Q-RR). Perhaps surprisingly, the theoretical analysis of Q-RR does not show any benefit of using RR; this is due to the additional compression variance. Our extensive numerical experiments confirm this phenomenon. To reveal the true advantages of RR in distributed learning with compression, we propose a new method called DIANA-RR that reduces the compression variance and has provably better convergence rates than existing counterparts based on with-replacement sampling of stochastic gradients. Next, to better fit Federated Learning applications, we incorporate local computation: we propose and analyze Q-NASTYA and DIANA-NASTYA, variants of Q-RR and DIANA-RR that use local gradient steps and different local and global stepsizes. Finally, we conduct several numerical experiments to illustrate our theoretical results.
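To make the na\"ive combination concrete, one natural per-iteration update for Q-RR reads as follows (a sketch under assumed notation, with $M$ workers, stepsize $\gamma$, an unbiased compression operator $\mathcal{Q}$ such as Rand-$K$, and local permutations $\pi_m$ redrawn at the start of every epoch; the paper's exact formulation may differ):
\[
    x_t^{i+1} \;=\; x_t^i \;-\; \gamma \cdot \frac{1}{M} \sum_{m=1}^{M} \mathcal{Q}\!\left( \nabla f_{m,\,\pi_m^i}\!\left(x_t^i\right) \right),
\]
where $f_{m,j}$ denotes the $j$-th local loss on worker $m$, so each worker processes its data without replacement within an epoch and compresses the resulting stochastic gradient before communication. In the spirit of DIANA, the shift-based variant would instead compress differences between these gradients and learned shift vectors, which is the mechanism by which the compression variance is reduced.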