In this paper, we propose a class of faster adaptive Gradient Descent Ascent (GDA) methods for solving nonconvex-strongly-concave minimax problems by using unified adaptive matrices, which include almost all existing coordinate-wise and global adaptive learning rates. In particular, we provide an effective convergence analysis framework for our adaptive GDA methods. Specifically, we propose a fast Adaptive Gradient Descent Ascent (AdaGDA) method based on the basic momentum technique, which reaches a gradient complexity of $O(\kappa^4\epsilon^{-4})$ for finding an $\epsilon$-stationary point without large batches, improving the existing results of adaptive GDA methods by a factor of $O(\sqrt{\kappa})$. We further present an accelerated version of the AdaGDA method (VR-AdaGDA) based on the momentum-based variance-reduction technique, which achieves a gradient complexity of $O(\kappa^{4.5}\epsilon^{-3})$ for finding an $\epsilon$-stationary point without large batches, improving the existing results of adaptive GDA methods by a factor of $O(\epsilon^{-1})$. Moreover, we prove that our VR-AdaGDA method can reach the best-known gradient complexity of $O(\kappa^{3}\epsilon^{-3})$ with a mini-batch size of $O(\kappa^3)$. Experimental results on policy evaluation and fair classifier tasks verify the efficiency of our algorithms.
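To make the structure of such an adaptive GDA step concrete, the following is a minimal Python sketch of one momentum-based update with a coordinate-wise (Adam-style) adaptive matrix, which is one instance of the unified adaptive matrices the abstract mentions. All names, step sizes, and the specific choice of adaptive matrix here are illustrative assumptions, not the paper's exact algorithm; `grad_x` and `grad_y` stand for stochastic gradients of the objective $f(x, y)$ supplied by the caller.

```python
import numpy as np

def adagda_step(x, y, v, w, a, b, grad_x, grad_y,
                alpha=0.9, beta=0.9, rho=0.99,
                gamma=0.01, lam=0.01, eps=1e-8):
    """One illustrative AdaGDA-style step (hypothetical sketch).

    grad_x, grad_y: stochastic gradients of f at (x, y).
    v, w: momentum estimates of the x- and y-gradients.
    a, b: running second moments defining diagonal adaptive
          matrices A = diag(sqrt(a) + eps), B = diag(sqrt(b) + eps).
    """
    # Basic momentum gradient estimators (the VR-AdaGDA variant would
    # use a variance-reduced estimator here instead).
    v = (1 - alpha) * v + alpha * grad_x
    w = (1 - beta) * w + beta * grad_y
    # Adam-style coordinate-wise adaptive matrices; the unified framework
    # also covers global (scalar) adaptive learning rates.
    a = rho * a + (1 - rho) * grad_x ** 2
    b = rho * b + (1 - rho) * grad_y ** 2
    # Descent on x, ascent on y, preconditioned by A^{-1} and B^{-1}.
    x = x - gamma * v / (np.sqrt(a) + eps)
    y = y + lam * w / (np.sqrt(b) + eps)
    return x, y, v, w, a, b
```

Under this reading, swapping the momentum estimator or the adaptive matrix changes the variant but not the overall descent-ascent structure, which is what allows a single convergence analysis framework to cover both coordinate-wise and global adaptive learning rates.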