用于非电流优化的智多构-SGD的趋同分析 (Convergence Analysis of Homotopy-SGD for non-convex optimization) - 专知论文

会员服务 ·

0

优化器 · SGD · 极小点 · Machine Learning · Neural Networks ·

2020 年 11 月 20 日

Convergence Analysis of Homotopy-SGD for non-convex optimization

翻译：用于非电流优化的智多构-SGD的趋同分析

Matilde Gargiani,Andrea Zanelli,Quoc Tran-Dinh,Moritz Diehl,Frank Hutter

from arxiv, 21 pages, 14 figures, technical report

First-order stochastic methods for solving large-scale non-convex optimization problems are widely used in many big-data applications, e.g. training deep neural networks as well as other complex and potentially non-convex machine learning models. Their inexpensive iterations generally come together with slow global convergence rate (mostly sublinear), leading to the necessity of carrying out a very high number of iterations before the iterates reach a neighborhood of a minimizer. In this work, we present a first-order stochastic algorithm based on a combination of homotopy methods and SGD, called Homotopy-Stochastic Gradient Descent (H-SGD), which finds interesting connections with some proposed heuristics in the literature, e.g. optimization by Gaussian continuation, training by diffusion, mollifying networks. Under some mild assumptions on the problem structure, we conduct a theoretical analysis of the proposed algorithm. Our analysis shows that, with a specifically designed scheme for the homotopy parameter, H-SGD enjoys a global linear rate of convergence to a neighborhood of a minimum while maintaining fast and inexpensive iterations. Experimental evaluations confirm the theoretical results and show that H-SGD can outperform standard SGD.

翻译：在许多大数据应用中,例如培训深神经网络以及其他复杂和潜在的非凝固机器学习模型,都广泛使用第一级测序方法来解决大规模非凝固优化问题,例如,培训深神经网络以及其他复杂和潜在的非凝固机器学习模型,它们的廉价迭代通常伴随着缓慢的全球趋同率(主要是次线性线性),导致有必要进行大量的迭代,在迭代到达一个最小化器附近之前,进行大量的迭代。在这项工作中,我们提出了一个以同质调法和SGD(称为H-SGD)相结合的第一级测序算法。 SGD发现与文献中某些拟议的超常相连接的令人感兴趣,例如高斯的继续优化、通过扩散培训、网络变压。在对问题结构进行一些温和的假设下,我们对拟议的算法进行了理论分析。我们的分析表明,由于专门设计了一个同质调参数计划,H-SGD拥有与一个最起码的相近线性趋同率率,同时保持快速和廉价的理论性SG结果。

0

相关内容

优化器

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【斯坦福】凸优化圣经- Convex Optimization （附730pdf下载）

【斯坦福】凸优化圣经- Convex Optimization （附730pdf下载）

专知会员服务

230+阅读 · 2020年6月5日

【ICCV 2019 Toturial】Global Optimization for Geometric Understanding with Provable Guarantees（具有可证明保证的几何理解的全局优化）

【ICCV 2019 Toturial】Global Optimization for Geometric Understanding with Provable Guarantees（具有可证明保证的几何理解的全局优化）

专知会员服务

18+阅读 · 2019年11月1日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

随波逐流：Similarity-Adaptive and Discrete Optimization

随波逐流：Similarity-Adaptive and Discrete Optimization

我爱读PAMI

5+阅读 · 2018年2月6日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Consistency analysis of bilevel data-driven learning in inverse problems

Consistency analysis of bilevel data-driven learning in inverse problems

Arxiv

0+阅读 · 2021年1月7日

An Algorithm for the Discovery of Independence from Data

An Algorithm for the Discovery of Independence from Data

Arxiv

0+阅读 · 2021年1月7日

Federated Learning over Noisy Channels: Convergence Analysis and Design Examples

Federated Learning over Noisy Channels: Convergence Analysis and Design Examples

Arxiv

0+阅读 · 2021年1月6日

Delayed Projection Techniques for Linearly Constrained Problems: Convergence Rates, Acceleration, and Applications

Arxiv

0+阅读 · 2021年1月5日

On the convergence rate of the Kačanov scheme for shear-thinning fluids

Arxiv

0+阅读 · 2021年1月5日

On the global convergence of randomized coordinate gradient descent for non-convex optimization

Arxiv

0+阅读 · 2021年1月5日

Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

Arxiv

11+阅读 · 2020年2月18日

Optimization for deep learning: theory and algorithms

Optimization for deep learning: theory and algorithms

Arxiv

106+阅读 · 2019年12月19日

Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization

Arxiv

3+阅读 · 2018年10月1日

Accelerated Randomized Coordinate Descent Algorithms for Stochastic Optimization and Online Learning

Arxiv

9+阅读 · 2018年7月16日

VIP会员

文章信息

相关主题

Machine Learning

Neural Networks

相关VIP内容

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【斯坦福】凸优化圣经- Convex Optimization （附730pdf下载）

【斯坦福】凸优化圣经- Convex Optimization （附730pdf下载）

专知会员服务

230+阅读 · 2020年6月5日

【ICCV 2019 Toturial】Global Optimization for Geometric Understanding with Provable Guarantees（具有可证明保证的几何理解的全局优化）

【ICCV 2019 Toturial】Global Optimization for Geometric Understanding with Provable Guarantees（具有可证明保证的几何理解的全局优化）

专知会员服务

18+阅读 · 2019年11月1日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

美陆军五大转型方向

一种Agent自主性风险评估框架 | 最新文献

实时无人机指令处理：一种面向无人机系统的大语言模型方法

基于动态知识图谱的人工智能代理自主研究周期 | 文献

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

随波逐流：Similarity-Adaptive and Discrete Optimization

随波逐流：Similarity-Adaptive and Discrete Optimization

我爱读PAMI

5+阅读 · 2018年2月6日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Consistency analysis of bilevel data-driven learning in inverse problems

Consistency analysis of bilevel data-driven learning in inverse problems

Arxiv

0+阅读 · 2021年1月7日

An Algorithm for the Discovery of Independence from Data

An Algorithm for the Discovery of Independence from Data

Arxiv

0+阅读 · 2021年1月7日

Federated Learning over Noisy Channels: Convergence Analysis and Design Examples

Federated Learning over Noisy Channels: Convergence Analysis and Design Examples

Arxiv

0+阅读 · 2021年1月6日

Delayed Projection Techniques for Linearly Constrained Problems: Convergence Rates, Acceleration, and Applications

Arxiv

0+阅读 · 2021年1月5日

On the convergence rate of the Kačanov scheme for shear-thinning fluids

Arxiv

0+阅读 · 2021年1月5日

On the global convergence of randomized coordinate gradient descent for non-convex optimization

Arxiv

0+阅读 · 2021年1月5日

Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

Arxiv

11+阅读 · 2020年2月18日

Optimization for deep learning: theory and algorithms

Optimization for deep learning: theory and algorithms

Arxiv

106+阅读 · 2019年12月19日

Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization

Arxiv

3+阅读 · 2018年10月1日

Accelerated Randomized Coordinate Descent Algorithms for Stochastic Optimization and Online Learning

Arxiv

9+阅读 · 2018年7月16日

微信扫码咨询专知VIP会员