An evolving line of work in machine learning has observed empirical evidence suggesting that interpolating estimators -- ones that achieve zero training error -- may not necessarily be harmful. This paper pursues a theoretical understanding of an important type of interpolator: the minimum $\ell_{1}$-norm interpolator, which is motivated by the observation that several learning algorithms favor low $\ell_1$-norm solutions in the over-parameterized regime. Concretely, we consider the noisy sparse regression model under Gaussian design, focusing on linear sparsity and high-dimensional asymptotics (so that both the number of features and the sparsity level scale proportionally with the sample size). We observe, and provide rigorous theoretical justification for, a curious multi-descent phenomenon: the generalization risk of the minimum $\ell_1$-norm interpolator undergoes multiple (possibly more than two) phases of descent and ascent as the model capacity increases. This phenomenon stems from the special structure of the minimum $\ell_1$-norm interpolator, as well as the delicate interplay between the over-parameterization ratio and the sparsity level, thereby unveiling a fundamental geometric distinction from the minimum $\ell_2$-norm interpolator. Our finding is built upon an exact characterization of the risk behavior, which is governed by a system of two non-linear equations with two unknowns.
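For concreteness, the estimator under study admits the following standard formulation (the notation $n$, $p$, $k$, $X$, $\beta^{\star}$, $\varepsilon$ is introduced here for illustration and is not fixed by the text above):
\[
\widehat{\beta} \;\in\; \arg\min_{\beta \in \mathbb{R}^{p}} \|\beta\|_{1} \quad \text{subject to} \quad X\beta = y, \qquad \text{where } y = X\beta^{\star} + \varepsilon,
\]
with $X \in \mathbb{R}^{n \times p}$ a Gaussian design matrix and $\beta^{\star}$ a $k$-sparse signal; the linear-sparsity, proportional regime takes $p/n$ and $k/n$ to fixed constants as $n \to \infty$, and the generalization risk tracks (up to the noise level) the estimation error $\|\widehat{\beta} - \beta^{\star}\|_{2}^{2}$.

Because this program is a linear program, the risk curve can also be probed numerically. Below is a minimal simulation sketch under the assumptions just stated (our own illustration, not the paper's code or its exact risk metric); the parameter choices are arbitrary, and the shape of the resulting risk curve depends on the sparsity regime:

    # A minimal simulation sketch (illustrative only): compute the minimum
    # l1-norm interpolator via the standard LP reformulation beta = u - v
    # with u, v >= 0, and track the estimation error as p grows.
    import numpy as np
    from scipy.optimize import linprog

    def min_l1_interpolator(X, y):
        """Solve min ||beta||_1 subject to X beta = y as a linear program."""
        n, p = X.shape
        c = np.ones(2 * p)                # objective: sum(u) + sum(v) = ||beta||_1
        A_eq = np.hstack([X, -X])         # equality constraint: X(u - v) = y
        res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
        return res.x[:p] - res.x[p:]

    rng = np.random.default_rng(0)
    n, sigma = 100, 0.5                   # sample size and noise level (arbitrary)
    k = 10                                # linear sparsity: k/n is held fixed
    for p in [150, 300, 600, 1200]:       # sweep the over-parameterization ratio p/n
        X = rng.standard_normal((n, p)) / np.sqrt(n)   # Gaussian design
        beta_star = np.zeros(p)
        beta_star[:k] = 1.0               # k-sparse signal
        y = X @ beta_star + sigma * rng.standard_normal(n)
        beta_hat = min_l1_interpolator(X, y)           # interpolates: X beta_hat = y
        err = np.sum((beta_hat - beta_star) ** 2)
        print(f"p/n = {p / n:4.1f}   estimation error = {err:.3f}")

The LP reformulation is the standard device for $\ell_1$ minimization: the constraint set of the program is a polytope, and the polytopal geometry of the $\ell_1$ ball (in contrast to the rotation-invariant $\ell_2$ ball) underlies the geometric distinction drawn above.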