Benign Overfitting of Constant-Stepsize SGD for Linear Regression - Zhuanzhi Papers


SGD · Overfitting · Covariance Matrix · Linear Regression · Linear

March 23, 2021

Benign Overfitting of Constant-Stepsize SGD for Linear Regression


Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham M. Kakade

From arXiv, 53 pages

There is an increasing realization that algorithmic inductive biases are central in preventing overfitting; empirically, we often see a benign overfitting phenomenon in overparameterized settings for natural learning algorithms, such as stochastic gradient descent (SGD), where little to no explicit regularization has been employed. This work considers this issue in arguably the most basic setting: constant-stepsize SGD (with iterate averaging) for linear regression in the overparameterized regime. Our main result provides a sharp excess risk bound, stated in terms of the full eigenspectrum of the data covariance matrix, that reveals a bias-variance decomposition characterizing when generalization is possible: (i) the variance bound is characterized in terms of an effective dimension (specific for SGD) and (ii) the bias bound provides a sharp geometric characterization in terms of the location of the initial iterate (and how it aligns with the data covariance matrix). We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares (minimum-norm interpolation) and ridge regression.
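The setting in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it runs one pass of constant-stepsize SGD with iterate averaging on a synthetic overparameterized linear regression problem (more dimensions than samples) and compares the excess risk of the averaged iterate with that of the zero initialization. The polynomially decaying spectrum, the noise level, the stepsize rule, the averaging convention (averaging the iterates before each update), the factor 1/2 in the risk, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def averaged_sgd(X, y, stepsize, w0=None):
    """One pass of constant-stepsize SGD on the squared loss.

    Returns the average of the iterates w_0, ..., w_{n-1}
    (an assumed averaging convention for this sketch).
    """
    n, d = X.shape
    w = np.zeros(d) if w0 is None else np.asarray(w0, dtype=float).copy()
    w_sum = np.zeros(d)
    for t in range(n):
        w_sum += w                          # accumulate the iterate before updating it
        residual = X[t] @ w - y[t]          # prediction error on the t-th sample
        w = w - stepsize * residual * X[t]  # stochastic gradient step, fixed stepsize
    return w_sum / n


def excess_risk(w, w_star, eigenvalues):
    """Excess risk 0.5 * (w - w*)^T H (w - w*) for a diagonal covariance H."""
    diff = w - w_star
    return 0.5 * float(np.sum(eigenvalues * diff ** 2))


rng = np.random.default_rng(0)
n, d = 500, 1000                               # overparameterized: d > n
eigenvalues = 1.0 / np.arange(1, d + 1) ** 2   # assumed spectrum of the data covariance
w_star = np.zeros(d)
w_star[:5] = 1.0                               # assumed ground-truth parameter

# Gaussian features with covariance diag(eigenvalues), plus label noise.
X = rng.standard_normal((n, d)) * np.sqrt(eigenvalues)
y = X @ w_star + 0.1 * rng.standard_normal(n)

stepsize = 0.2 / np.sum(eigenvalues)           # a fixed stepsize well below 1 / tr(H)
w_bar = averaged_sgd(X, y, stepsize)

risk_init = excess_risk(np.zeros(d), w_star, eigenvalues)
risk_sgd = excess_risk(w_bar, w_star, eigenvalues)
print(f"excess risk at initialization: {risk_init:.4f}")
print(f"excess risk of averaged SGD:   {risk_sgd:.4f}")
```

Changing the decay rate of `eigenvalues` or the starting point `w0` is a simple way to probe the dependence on the covariance spectrum and on the initial iterate that the abstract describes, although a simulation like this only illustrates the behavior and does not verify the paper's bounds.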


