配有缩小缩入的子设置选择: SNR 值低时的粗略线性建模 (Subset Selection with Shrinkage: Sparse Linear Modeling when the SNR is low) - 专知论文

会员服务 ·

0

Performer · 线性模型 · 岭回归 · 稀疏 · 估计/估计量 ·

2022 年 1 月 8 日

Subset Selection with Shrinkage: Sparse Linear Modeling when the SNR is low

翻译：配有缩小缩入的子设置选择: SNR 值低时的粗略线性建模

Rahul Mazumder,Peter Radchenko,Antoine Dedieu

We study a seemingly unexpected and relatively less understood overfitting aspect of a fundamental tool in sparse linear modeling - best subset selection, which minimizes the residual sum of squares subject to a constraint on the number of nonzero coefficients. While the best subset selection procedure is often perceived as the "gold standard" in sparse learning when the signal to noise ratio (SNR) is high, its predictive performance deteriorates when the SNR is low. In particular, it is outperformed by continuous shrinkage methods, such as ridge regression and the Lasso. We investigate the behavior of best subset selection in the high-noise regimes and propose an alternative approach based on a regularized version of the least-squares criterion. Our proposed estimators (a) mitigate, to a large extent, the poor predictive performance of best subset selection in the high-noise regimes; and (b) perform favorably, while generally delivering substantially sparser models, relative to the best predictive models available via ridge regression and the Lasso. We conduct an extensive theoretical analysis of the predictive properties of the proposed approach and provide justification for its superior predictive performance relative to best subset selection when the noise-level is high. Our estimators can be expressed as solutions to mixed integer second order conic optimization problems and, hence, are amenable to modern computational tools from mathematical optimization.

翻译：我们研究一个似乎出乎意料和相对理解较少的偏差基本工具在稀疏线性建模(最佳子集选择)方面似乎出乎意料,而且相对而言理解较少,这一基本工具在细线性建模(最佳子集选择)方面过于适合,在非零系数数量的限制下,将方方块的剩余总和最小值最小值最小值最小值最小值,而最佳子集选择程序在噪音比信号高时往往被视为稀疏学习中的“黄金标准 ”, 其预测性能在SNR低时则会恶化, 特别是,它由于脊脊回归和激光索等连续的缩缩缩缩方法,而表现得远远超出。我们调查了高噪音制度中最佳子集精度选择的行为,并提出了一种基于固定化的最差值最低值标准的替代方法。我们提议的子集选择者(a) 在很大程度上减轻了在高音比子集度选择中的最佳子集值的预测性能; 相对于通过峰值回归和Lasso等方法提供最差的模型。我们对拟议方法的预测性化特性进行广泛的理论分析,并为其高级预测性优化性压性压的精确性计算提供了理由。

0

相关内容

Performer

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

专知会员服务

53+阅读 · 2021年1月20日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】免费书(草稿)：数据科学的数学基础

【推荐】免费书(草稿)：数据科学的数学基础

机器学习研究会

20+阅读 · 2017年10月1日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

社交网络级联数据流异常检测模型研究

国家自然科学基金

4+阅读 · 2015年12月31日

方差正则化的分类模型选择方法研究

国家自然科学基金

1+阅读 · 2015年12月31日

面向光学相干层析成像的三维结构化压缩感知方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

超分辨率中的矩阵值算子学习问题

国家自然科学基金

1+阅读 · 2014年12月31日

套子代数的Hochschild上同调及套的分类

国家自然科学基金

3+阅读 · 2014年12月31日

基于基尼系数和改进压缩感知的ISAR成像新方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

近似稀疏高维非参与半参模型的Dantzig Selector的研究

国家自然科学基金

0+阅读 · 2012年12月31日

稀疏张量学习理论

国家自然科学基金

1+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

图的若干参数及算法研究

国家自然科学基金

0+阅读 · 2011年12月31日

Practical considerations for specifying a super learner

Practical considerations for specifying a super learner

Arxiv

0+阅读 · 2022年4月19日

Expert-Calibrated Learning for Online Optimization with Switching Costs

Arxiv

0+阅读 · 2022年4月18日

Selection of proposal distributions for multiple importance sampling

Arxiv

0+阅读 · 2022年4月18日

Subset selection for linear mixed models

Arxiv

1+阅读 · 2022年4月18日

Optimal Subsampling for High-dimensional Ridge Regression

Arxiv

0+阅读 · 2022年4月18日

Inference for Cluster Randomized Experiments with Non-ignorable Cluster Sizes

Inference for Cluster Randomized Experiments with Non-ignorable Cluster Sizes

Arxiv

0+阅读 · 2022年4月18日

Proximal nested sampling for high-dimensional Bayesian model selection

Proximal nested sampling for high-dimensional Bayesian model selection

Arxiv

0+阅读 · 2022年4月15日

A Reinforcement Learning Approach to Parameter Selection for Distributed Optimal Power Flow

Arxiv

0+阅读 · 2022年4月15日

Distributed Reconstruction of Noisy Pooled Data

Arxiv

0+阅读 · 2022年4月14日

Iterative Hard Thresholding with Adaptive Regularization: Sparser Solutions Without Sacrificing Runtime

Arxiv

0+阅读 · 2022年4月11日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

剑桥大学《数据科学: 原理与实践》课程，附PPT下载

专知会员服务

53+阅读 · 2021年1月20日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《小型无人机系统侦测追踪技术：声学、计算机视觉与深度学习融合方案》最新98页

《"牧羊人网格"拦截策略：实现无人机集群可靠拦截的新范式》

光纤无人机：反无人机系统的重大挑战

《作战建模与仿真实证研究》

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】免费书(草稿)：数据科学的数学基础

【推荐】免费书(草稿)：数据科学的数学基础

机器学习研究会

20+阅读 · 2017年10月1日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

Practical considerations for specifying a super learner

Practical considerations for specifying a super learner

Arxiv

0+阅读 · 2022年4月19日

Expert-Calibrated Learning for Online Optimization with Switching Costs

Arxiv

0+阅读 · 2022年4月18日

Selection of proposal distributions for multiple importance sampling

Arxiv

0+阅读 · 2022年4月18日

Subset selection for linear mixed models

Arxiv

1+阅读 · 2022年4月18日

Optimal Subsampling for High-dimensional Ridge Regression

Arxiv

0+阅读 · 2022年4月18日

Inference for Cluster Randomized Experiments with Non-ignorable Cluster Sizes

Inference for Cluster Randomized Experiments with Non-ignorable Cluster Sizes

Arxiv

0+阅读 · 2022年4月18日

Proximal nested sampling for high-dimensional Bayesian model selection

Proximal nested sampling for high-dimensional Bayesian model selection

Arxiv

0+阅读 · 2022年4月15日

A Reinforcement Learning Approach to Parameter Selection for Distributed Optimal Power Flow

Arxiv

0+阅读 · 2022年4月15日

Distributed Reconstruction of Noisy Pooled Data

Arxiv

0+阅读 · 2022年4月14日

Iterative Hard Thresholding with Adaptive Regularization: Sparser Solutions Without Sacrificing Runtime

Arxiv

0+阅读 · 2022年4月11日

相关基金

社交网络级联数据流异常检测模型研究

国家自然科学基金

4+阅读 · 2015年12月31日

方差正则化的分类模型选择方法研究

国家自然科学基金

1+阅读 · 2015年12月31日

面向光学相干层析成像的三维结构化压缩感知方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

超分辨率中的矩阵值算子学习问题

国家自然科学基金

1+阅读 · 2014年12月31日

套子代数的Hochschild上同调及套的分类

国家自然科学基金

3+阅读 · 2014年12月31日

基于基尼系数和改进压缩感知的ISAR成像新方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

近似稀疏高维非参与半参模型的Dantzig Selector的研究

国家自然科学基金

0+阅读 · 2012年12月31日

稀疏张量学习理论

国家自然科学基金

1+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

图的若干参数及算法研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员