Stability and Generalization of Stochastic Gradient Methods for Minimax Problems - Zhuanzhi Papers (专知论文)

Generalization theory · Stochastic gradient descent · CASES · contrastive · 学成

July 12, 2021

Stability and Generalization of Stochastic Gradient Methods for Minimax Problems

Yunwen Lei, Zhenhuan Yang, Tianbao Yang, Yiming Ying

From arXiv. To appear in ICML 2021 as a Long Presentation.

Many machine learning problems can be formulated as minimax problems, such as Generative Adversarial Networks (GANs), AUC maximization, and robust estimation, to mention but a few. A substantial number of studies are devoted to the convergence behavior of their stochastic gradient-type algorithms. In contrast, there is relatively little work on their generalization, i.e., how the learning models built from training examples would behave on test examples. In this paper, we provide a comprehensive generalization analysis of stochastic gradient methods for minimax problems under both convex-concave and nonconvex-nonconcave cases through the lens of algorithmic stability. We establish a quantitative connection between stability and several generalization measures, both in expectation and with high probability. For the convex-concave setting, our stability analysis shows that stochastic gradient descent ascent attains optimal generalization bounds for both smooth and nonsmooth minimax problems. We also establish generalization bounds for both weakly-convex-weakly-concave and gradient-dominated problems.
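
The abstract names stochastic gradient descent ascent (SGDA) and algorithmic stability but gives no update rule. The sketch below illustrates both ideas under stated assumptions: it uses the standard simultaneous SGDA update (descent on the primal variable `w`, ascent on the dual variable `v`) on a toy convex-concave objective chosen here for illustration, and it probes stability by running the same algorithm, with the same sampling sequence, on two datasets that differ in one example. The objective, the function names `sgda` and `stability_gap`, and all hyperparameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def sgda(X, y, steps, eta, rho, seed):
    """Simultaneous SGDA on the toy objective (an assumption, not the paper's):
    f(w, v; (x, y)) = v * (x @ w - y) - v**2 / 2 + rho / 2 * ||w||^2,
    which is convex in w and concave in v.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, v = np.zeros(d), 0.0
    for _ in range(steps):
        i = rng.integers(n)            # sample one training example
        x_i, y_i = X[i], y[i]
        grad_w = v * x_i + rho * w     # gradient of f with respect to w
        grad_v = x_i @ w - y_i - v     # gradient of f with respect to v
        w = w - eta * grad_w           # descent step on the min variable
        v = v + eta * grad_v           # ascent step on the max variable
    return w, v


def stability_gap(n, d=5, steps=2000, eta=0.01, rho=0.1, seed=0):
    """Distance between SGDA outputs on two datasets differing in one example."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    y = X @ np.ones(d) + 0.1 * rng.normal(size=n)
    # Neighboring dataset: replace only the first example.
    X2, y2 = X.copy(), y.copy()
    X2[0] = rng.normal(size=d)
    y2[0] = rng.normal()
    # Same seed, so both runs visit the same sequence of indices.
    w1, v1 = sgda(X, y, steps, eta, rho, seed=123)
    w2, v2 = sgda(X2, y2, steps, eta, rho, seed=123)
    return float(np.sqrt(np.sum((w1 - w2) ** 2) + (v1 - v2) ** 2))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, stability_gap(n))
```

Sharing the seed between the two runs isolates the effect of the single replaced example from sampling noise, which is the comparison a stability argument reasons about. The printed distances are an empirical probe on one toy problem, not a verification of the paper's bounds.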
