The Benefits of Implicit Regularization from SGD in Least Squares Problems - 专知 (Zhuanzhi) Papers

SGD · Regularization term · Ridge regression · Generalization theory · Square matrix

July 11, 2022

The Benefits of Implicit Regularization from SGD in Least Squares Problems

Difan Zou,Jingfeng Wu,Vladimir Braverman,Quanquan Gu,Dean P. Foster,Sham M. Kakade

From arXiv, 33 pages, 1 figure. In NeurIPS 2021.

Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice, which has been hypothesized to play an important role in the generalization of modern machine learning approaches. In this work, we seek to understand these issues in the simpler setting of linear regression (including both underparameterized and overparameterized regimes), where our goal is to make sharp instance-based comparisons of the implicit regularization afforded by (unregularized) average SGD with the explicit regularization of ridge regression. For a broad class of least squares problem instances (that are natural in high-dimensional settings), we show: (1) for every problem instance and for every ridge parameter, (unregularized) SGD, when provided with logarithmically more samples than that provided to the ridge algorithm, generalizes no worse than the ridge solution (provided SGD uses a tuned constant stepsize); (2) conversely, there exist instances (in this wide problem class) where optimally-tuned ridge regression requires quadratically more samples than SGD in order to have the same generalization performance. Taken together, our results show that, up to the logarithmic factors, the generalization performance of SGD is always no worse than that of ridge regression in a wide range of overparameterized problems, and, in fact, could be much better for some problem instances. More generally, our results show how algorithmic regularization has important consequences even in simpler (overparameterized) convex settings.
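
The comparison described in the abstract can be illustrated with a small numerical sketch. The code below is not from the paper: it assumes a toy Gaussian design with covariance eigenvalues 1/i^2, a hypothetical ground truth `w_star` of all ones, and arbitrary choices of stepsize and ridge parameter. It runs one pass of constant-stepsize SGD with iterate averaging, solves ridge regression in closed form, and reports the excess risk of each (defined here without the conventional 1/2 factor). No particular outcome is claimed.

```python
import numpy as np


def make_problem(n, d, rng, noise_std=0.1):
    """Toy Gaussian linear regression (assumed setup, not the paper's).

    The covariance is diagonal with eigenvalues 1/i^2, and the
    ground truth w_star is a hypothetical all-ones vector.
    """
    eigs = 1.0 / np.arange(1, d + 1) ** 2
    w_star = np.ones(d)
    X = rng.standard_normal((n, d)) * np.sqrt(eigs)
    y = X @ w_star + noise_std * rng.standard_normal(n)
    return X, y, w_star, eigs


def averaged_sgd(X, y, stepsize):
    """One pass of constant-stepsize SGD from w_0 = 0.

    Returns the average of the iterates w_0, ..., w_{n-1}.
    """
    n, d = X.shape
    w = np.zeros(d)
    w_sum = np.zeros(d)
    for x_t, y_t in zip(X, y):
        w_sum += w
        # Gradient of 0.5 * (x_t @ w - y_t)^2 with respect to w.
        w = w - stepsize * (x_t @ w - y_t) * x_t
    return w_sum / n


def ridge(X, y, lam):
    """Closed-form minimizer of ||X w - y||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def excess_risk(w, w_star, eigs):
    """(w - w_star)^T H (w - w_star) for the diagonal covariance H = diag(eigs)."""
    diff = w - w_star
    return float(np.sum(eigs * diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_ridge, d = 200, 400                       # overparameterized: d > n
    n_sgd = int(n_ridge * np.log(n_ridge))      # logarithmically more samples for SGD
    X_r, y_r, w_star, eigs = make_problem(n_ridge, d, rng)
    X_s, y_s, _, _ = make_problem(n_sgd, d, rng)
    print("ridge excess risk:", excess_risk(ridge(X_r, y_r, lam=1.0), w_star, eigs))
    print("SGD excess risk:  ", excess_risk(averaged_sgd(X_s, y_s, stepsize=0.1), w_star, eigs))
```

Giving SGD roughly n log n samples mirrors the sample budget in claim (1) of the abstract. The stepsize 0.1 and ridge parameter 1.0 are untuned placeholders, whereas the abstract's statements concern a tuned stepsize and every (or the optimal) ridge parameter.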
