Random Reshuffling (RR), also known as Stochastic Gradient Descent (SGD) without replacement, is a popular and theoretically grounded method for finite-sum minimization. We propose two new algorithms: Proximal and Federated Random Reshuffling (ProxRR and FedRR). The first algorithm, ProxRR, solves composite convex finite-sum minimization problems in which the objective is the sum of a (potentially non-smooth) convex regularizer and an average of $n$ smooth objectives. We obtain the second algorithm, FedRR, as a special case of ProxRR applied to a reformulation of distributed problems with either homogeneous or heterogeneous data. We study the algorithms' convergence properties with constant and decreasing stepsizes, and show that they have considerable advantages over Proximal and Local SGD. In particular, our methods have superior complexities, and ProxRR evaluates the proximal operator only once per epoch. When the proximal operator is expensive to compute, this small difference makes ProxRR up to $n$ times faster than algorithms that evaluate the proximal operator in every iteration. We give examples of practical optimization tasks where the proximal operator is difficult to compute and ProxRR has a clear advantage. Finally, we corroborate our results with experiments on real data sets.
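To make the structure described above concrete, the following is a minimal Python sketch of a ProxRR-style loop, written under stated assumptions rather than as the paper's exact pseudocode: the helper names `prox_rr`, `grads`, and `prox` are hypothetical, and the choice of applying the proximal operator with stepsize $\gamma n$ at the end of each epoch is an assumption consistent with the abstract's claim that the operator is evaluated only once per epoch.

```python
import numpy as np

def prox_rr(x0, grads, prox, gamma, n, epochs, rng=None):
    """Hypothetical sketch of a ProxRR-style method.

    grads(i, x): gradient of the i-th smooth component f_i at x.
    prox(x, t):  proximal operator of the regularizer with stepsize t.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(epochs):
        perm = rng.permutation(n)       # reshuffle the data each epoch
        for i in perm:                  # n plain SGD steps, no prox inside the epoch
            x = x - gamma * grads(i, x)
        x = prox(x, gamma * n)          # single proximal step per epoch (assumed stepsize gamma*n)
    return x

# Illustrative usage on a toy lasso problem: f_i(x) = 0.5*(a_i^T x - b_i)^2, R(x) = lam*||x||_1.
A, b, lam = np.random.randn(100, 20), np.random.randn(100), 0.1
grads = lambda i, x: (A[i] @ x - b[i]) * A[i]
prox = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - lam * t, 0.0)  # soft-thresholding
x = prox_rr(np.zeros(20), grads, prox, gamma=1e-3, n=100, epochs=50)
```

In this sketch, moving the proximal evaluation outside the inner loop is exactly what reduces the number of prox calls from $n$ per epoch to one, which is where the up-to-$n$-fold speedup for expensive proximal operators would come from.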