We provide sharp path-dependent generalization and excess error guarantees for the full-batch Gradient Descent (GD) algorithm on smooth losses (possibly non-Lipschitz, possibly nonconvex). At the heart of our analysis is a new technique for bounding the generalization error of deterministic symmetric algorithms, which shows that average output stability together with a bounded expected gradient of the loss at termination implies generalization. This key result shows that small generalization error occurs at stationary points, and allows us to bypass the Lipschitz or sub-Gaussian assumptions on the loss prevalent in previous work. For nonconvex, Polyak-Lojasiewicz (PL), convex, and strongly convex losses, we show the explicit dependence of the generalization error on the accumulated path-dependent optimization error, the terminal optimization error, the number of samples, and the number of iterations. For nonconvex smooth losses, we prove that full-batch GD efficiently generalizes close to any stationary point at termination, under a properly chosen decreasing step size. Further, if the loss is nonconvex but the objective is PL, we derive quadratically vanishing bounds on the generalization error and the corresponding excess risk, for a choice of a large constant step size. For (resp. strongly) convex smooth losses, we prove that full-batch GD also generalizes for large constant step sizes, and achieves (resp. quadratically) small excess risk while training fast. In all cases, our full-batch GD generalization error and excess risk bounds are strictly tighter than existing bounds for (stochastic) GD when the loss is smooth (but possibly non-Lipschitz).
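For concreteness, a minimal sketch of the setting the abstract refers to, using standard notation (the symbols $\ell$, $R_S$, $R$, $\eta_t$, the sample $S = \{z_1,\dots,z_n\}$, and the data distribution $\mathcal{D}$ are assumptions of this sketch, not fixed by the abstract): full-batch GD iterates on the empirical risk, and the generalization error is the gap between the population and empirical risks at the output.
\[
R_S(w) = \frac{1}{n}\sum_{i=1}^{n} \ell(w; z_i), \qquad
R(w) = \mathbb{E}_{z \sim \mathcal{D}}\big[\ell(w; z)\big],
\]
\[
w_{t+1} = w_t - \eta_t \nabla R_S(w_t), \qquad
\varepsilon_{\mathrm{gen}}(w_T) = \mathbb{E}\big[R(w_T) - R_S(w_T)\big],
\]
with excess risk $\mathbb{E}[R(w_T)] - \min_{w} R(w)$. The step-size choices mentioned above (decreasing $\eta_t$ in the nonconvex case, large constant $\eta_t$ in the PL and convex cases) refer to schedules for the update rule displayed here.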