Last iterate convergence of SGD for Least-Squares in the Interpolation regime - Zhuanzhi (专知) Papers

SGD · Reproducing kernel Hilbert space · Predictor/decision function · Stochastic gradient descent · Estimation/estimator

February 5, 2021

Last iterate convergence of SGD for Least-Squares in the Interpolation regime

Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion

From arXiv: 24 pages, 1 figure, 1 appendix

Motivated by the recent successes of neural networks that can fit the data perfectly and still generalize well, we study the noiseless model in the fundamental least-squares setup. We assume that an optimal predictor fits inputs and outputs perfectly, $\langle \theta_* , \phi(X) \rangle = Y$, where $\phi(X)$ stands for a possibly infinite-dimensional non-linear feature map. To solve this problem, we consider the estimator given by the last iterate of stochastic gradient descent (SGD) with constant step-size. In this context, our contribution is twofold: (i) from a (stochastic) optimization perspective, we exhibit an archetypal problem where we can show explicitly the convergence of the final SGD iterate for a non-strongly convex problem with constant step-size, whereas the usual results rely on some form of averaging; and (ii) from a statistical perspective, we give explicit non-asymptotic convergence rates in the over-parameterized setting and leverage a fine-grained parameterization of the problem to exhibit polynomial rates that can be faster than $O(1/T)$. The link with reproducing kernel Hilbert spaces is established.
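
As a rough illustration of the estimator described in the abstract, the following sketch runs constant step-size SGD on a stream of noiseless samples and returns the last iterate, with no averaging. It is not the authors' code: the finite dimension, the Gaussian features, the $1/i$ eigenvalue decay, the step-size value, and all function names are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def make_problem(dim=50, seed=0):
    """Build a hypothetical noiseless problem: an optimum theta_star and feature scales."""
    rng = np.random.default_rng(seed)
    theta_star = rng.standard_normal(dim)
    # Assumed covariance eigenvalues decaying like 1/i (illustrative choice only).
    scales = 1.0 / np.sqrt(np.arange(1, dim + 1))
    return theta_star, scales


def population_risk(theta, theta_star, scales):
    """Risk 0.5 * E[(<theta, x> - y)^2] when x has covariance diag(scales**2)."""
    return 0.5 * np.sum((scales * (theta - theta_star)) ** 2)


def last_iterate_sgd(theta_star, scales, step_size, num_steps, seed=1):
    """Constant step-size SGD on fresh noiseless samples; returns the last iterate."""
    rng = np.random.default_rng(seed)
    theta = np.zeros_like(theta_star)
    for _ in range(num_steps):
        # Fresh feature vector playing the role of phi(X).
        x = scales * rng.standard_normal(theta_star.shape[0])
        # Noiseless label: the optimum fits every sample exactly.
        y = x @ theta_star
        # Stochastic gradient step on 0.5 * (<theta, x> - y)^2.
        theta -= step_size * (x @ theta - y) * x
    return theta  # last iterate, no averaging


if __name__ == "__main__":
    theta_star, scales = make_problem()
    theta_T = last_iterate_sgd(theta_star, scales, step_size=0.05, num_steps=5000)
    print("initial risk:", population_risk(np.zeros_like(theta_star), theta_star, scales))
    print("final risk:  ", population_risk(theta_T, theta_star, scales))
```

Because the labels are noiseless, the stochastic gradient vanishes at $\theta_*$, which is why a constant step-size can be used here without averaging or decay; the step-size must still be small relative to the typical squared norm of the features.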
