AdaGrad(G-AdaGrad)和Adam:国家空间展望 (Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective) - 专知论文

会员服务 ·

0

AdaGrad · Adam · Performer · Extensibility · Machine Learning ·

2021 年 5 月 31 日

Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective

翻译：AdaGrad(G-AdaGrad)和Adam:国家空间展望

Kushal Chakrabarti,Nikhil Chopra

Accelerated gradient-based methods are being extensively used for solving non-convex machine learning problems, especially when the data points are abundant or the available data is distributed across several agents. Two of the prominent accelerated gradient algorithms are AdaGrad and Adam. AdaGrad is the simplest accelerated gradient method, which is particularly effective for sparse data. Adam has been shown to perform favorably in deep learning problems compared to other methods. In this paper, we propose a new fast optimizer, Generalized AdaGrad (G-AdaGrad), for accelerating the solution of potentially non-convex machine learning problems. Specifically, we adopt a state-space perspective for analyzing the convergence of gradient acceleration algorithms, namely G-AdaGrad and Adam, in machine learning. Our proposed state-space models are governed by ordinary differential equations. We present simple convergence proofs of these two algorithms in the deterministic settings with minimal assumptions. Our analysis also provides intuition behind improving upon AdaGrad's convergence rate. We provide empirical results on MNIST dataset to reinforce our claims on the convergence and performance of G-AdaGrad and Adam.

翻译：加速梯度法正在被广泛用于解决非冷冻机学习问题,特别是当数据点充足或现有数据分布于多个代理商时。两种显著的加速梯度算法是AdaGrad和Adam。AdaGrad是最简单的加速梯度法,对稀有数据特别有效。亚当已证明与其他方法相比,在深层学习问题方面表现优异。在本文中,我们提议一个新的快速优化器,即普遍化的AdaGrad(G-AdaGrad),以加速解决潜在的非冷冻机学习问题。具体地说,我们从州空间角度分析机械学习中梯度加速算法的趋同,即G-AdaGrad和Adam。我们提议的州-空间模型受普通差异方程式的制约。我们用最起码的假设来展示在确定性环境中这两种算法的简单趋同证据。我们的分析还提供了改进AdaGrad和Adam汇合率背后的直觉。我们提供了关于MNIST数据集的经验结果,以加强我们关于G-AdaGrad和Adam的趋同和表现的主张。

1

相关内容

AdaGrad

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

机器学习相关资源(框架、库、软件)大列表

机器学习相关资源(框架、库、软件)大列表

专知会员服务

40+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

神经网络学习率设置

神经网络学习率设置

机器学习研究会

4+阅读 · 2018年3月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

资源｜斯坦福课程：深度学习理论！

资源｜斯坦福课程：深度学习理论！

全球人工智能

17+阅读 · 2017年11月9日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Generating and Blending Game Levels via Quality-Diversity in the Latent Space of a Variational Autoencoder

Generating and Blending Game Levels via Quality-Diversity in the Latent Space of a Variational Autoencoder

Arxiv

0+阅读 · 2021年7月22日

A micromechanics-based variational phase-field model for fracture in geomaterials with brittle-tensile and compressive-ductile behavior

Arxiv

0+阅读 · 2021年7月22日

On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms

Arxiv

0+阅读 · 2021年7月21日

Connecting Constructive Notions of Ordinals in Homotopy Type Theory

Arxiv

0+阅读 · 2021年7月21日

Adaptive Universal Generalized PageRank Graph Neural Network

Arxiv

10+阅读 · 2021年1月22日

Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks

Arxiv

3+阅读 · 2020年5月13日

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

Arxiv

3+阅读 · 2019年9月25日

Generative Adversarial Autoencoder Networks

Arxiv

11+阅读 · 2018年3月23日

Scalable Generalized Dynamic Topic Models

Arxiv

7+阅读 · 2018年3月21日

CuLDA_CGS: Solving Large-scale LDA Problems on GPUs

Arxiv

3+阅读 · 2018年3月13日

VIP会员

文章信息

相关主题

Machine Learning

相关VIP内容

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

机器学习相关资源(框架、库、软件)大列表

机器学习相关资源(框架、库、软件)大列表

专知会员服务

40+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

通信行业：智能低空通感网络白皮书

3D形状生成：综述

6000字《伊朗-以色列战争解析：欺骗与信息战如何塑造公众认知》最新报告（附原文）

【博士论文】优化智能体工作流以提升信息获取效率

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

神经网络学习率设置

神经网络学习率设置

机器学习研究会

4+阅读 · 2018年3月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

资源｜斯坦福课程：深度学习理论！

资源｜斯坦福课程：深度学习理论！

全球人工智能

17+阅读 · 2017年11月9日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Generating and Blending Game Levels via Quality-Diversity in the Latent Space of a Variational Autoencoder

Generating and Blending Game Levels via Quality-Diversity in the Latent Space of a Variational Autoencoder

Arxiv

0+阅读 · 2021年7月22日

A micromechanics-based variational phase-field model for fracture in geomaterials with brittle-tensile and compressive-ductile behavior

Arxiv

0+阅读 · 2021年7月22日

On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms

Arxiv

0+阅读 · 2021年7月21日

Connecting Constructive Notions of Ordinals in Homotopy Type Theory

Arxiv

0+阅读 · 2021年7月21日

Adaptive Universal Generalized PageRank Graph Neural Network

Arxiv

10+阅读 · 2021年1月22日

Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks

Arxiv

3+阅读 · 2020年5月13日

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes

Arxiv

3+阅读 · 2019年9月25日

Generative Adversarial Autoencoder Networks

Arxiv

11+阅读 · 2018年3月23日

Scalable Generalized Dynamic Topic Models

Arxiv

7+阅读 · 2018年3月21日

CuLDA_CGS: Solving Large-scale LDA Problems on GPUs

Arxiv

3+阅读 · 2018年3月13日

微信扫码咨询专知VIP会员