On the Universality of the Double Descent Peak in Ridgeless Regression - Zhuanzhi (专知) Papers


Input distribution · Linear regression · Softplus · Linear · Almost surely

March 3, 2021

On the Universality of the Double Descent Peak in Ridgeless Regression


David Holzmüller

From arXiv. Accepted at ICLR 2021. 9 pages + 34 pages appendix. Changes in v4: ICLR camera-ready layout, extended discussion of related work. Experimental results can be reproduced using the code at https://github.com/dholzmueller/universal_double_descent

We prove a non-asymptotic distribution-independent lower bound for the expected mean squared generalization error caused by label noise in ridgeless linear regression. Our lower bound generalizes a similar known result to the overparameterized (interpolating) regime. In contrast to most previous works, our analysis applies to a broad class of input distributions with almost surely full-rank feature matrices, which allows us to cover various types of deterministic or random feature maps. Our lower bound is asymptotically sharp and implies that in the presence of label noise, ridgeless linear regression does not perform well around the interpolation threshold for any of these feature maps. We analyze the imposed assumptions in detail and provide a theory for analytic (random) feature maps. Using this theory, we can show that our assumptions are satisfied for input distributions with a (Lebesgue) density and feature maps given by random deep neural networks with analytic activation functions like sigmoid, tanh, softplus or GELU. As further examples, we show that feature maps from random Fourier features and polynomial kernels also satisfy our assumptions. We complement our theory with further experimental and analytic results.
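The noise-induced peak described in the abstract can be observed in a small simulation. The sketch below is not taken from the paper or its repository; it assumes isotropic Gaussian features, a true regression function equal to zero (so the whole error comes from label noise), and a hypothetical helper name `noise_induced_error`. Under these assumptions the excess error of the minimum-norm interpolator is simply the squared norm of the fitted weights.

```python
import numpy as np


def noise_induced_error(n, p, sigma=1.0, trials=200, seed=0):
    """Monte Carlo estimate of the label-noise error of ridgeless regression.

    Assumptions (illustrative, not from the paper): features are i.i.d.
    standard normal and the true regression function is zero, so the
    labels are pure noise with standard deviation `sigma`.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(trials):
        X = rng.standard_normal((n, p))      # n samples, p features
        y = sigma * rng.standard_normal(n)   # labels consist of noise only
        w = np.linalg.pinv(X) @ y            # minimum-norm least-squares fit
        # For x ~ N(0, I), the expected squared prediction (x @ w)**2
        # equals the squared Euclidean norm of w.
        errors.append(float(w @ w))
    return float(np.mean(errors))


if __name__ == "__main__":
    n = 30
    for p in (10, 20, 30, 45, 90):
        print(f"n={n}, p={p}: error ~ {noise_induced_error(n, p):.3f}")
```

Running the script for a fixed number of samples `n` and a growing number of features `p` should show the estimated error rising sharply near the interpolation threshold `p = n` and falling again on both sides. The empirical mean at `p = n` is heavy-tailed, so its exact value varies strongly with the seed and the number of trials.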


