Improving Lasso for model selection and prediction - Zhuanzhi Papers

January 25, 2021

Improving Lasso for model selection and prediction

Piotr Pokarowski, Wojciech Rejchel, Agnieszka Soltys, Michal Frej, Jan Mielniczuk

It is known that the Thresholded Lasso (TL), SCAD and MCP correct the intrinsic estimation bias of the Lasso. In this paper we propose an alternative method of improving the Lasso for predictive models with general convex loss functions, which encompass normal linear models, logistic regression, quantile regression and support vector machines. For a given penalty we order the absolute values of the non-zero Lasso coefficients and then select the final model from a small nested family by the Generalized Information Criterion. We derive exponential upper bounds on the selection error of the method. These results confirm that, at least for normal linear models, our algorithm seems to be the benchmark for the theory of model selection, as it is constructive, computationally efficient and leads to consistent model selection under weak assumptions. Constructivity of the algorithm means that, in contrast to the TL, SCAD or MCP, consistent selection does not rely on unknown parameters such as the cone invertibility factor. Instead, our algorithm only needs the sample size, the number of predictors and an upper bound on the noise parameter. We show in numerical experiments on synthetic and real-world data sets that an implementation of our algorithm is more accurate than implementations of the studied concave regularizations. Our procedure is implemented in the R package "DMRnet", which is available on the CRAN repository.
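To make the two-step idea in the abstract concrete, here is a minimal Python sketch for the normal linear model only. It is not the authors' implementation and does not use the DMRnet package. Several things are assumptions made for illustration: the function names `lasso_cd` and `select_nested_gic`, the plain coordinate-descent Lasso solver, the no-intercept setup (data assumed centered), and the criterion form RSS + penalty × model size with a penalty built from the noise-variance bound and the number of predictors.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (j excluded).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * (new - beta[j])
            beta[j] = new
    return beta

def select_nested_gic(X, y, lam, penalty):
    """Order the Lasso support by |coefficient|, then pick a nested model by GIC.

    The criterion used here, RSS + penalty * (model size), is an assumed
    illustrative form, not necessarily the one analysed in the paper.
    """
    beta = lasso_cd(X, y, lam)
    support = np.flatnonzero(beta)
    # Largest absolute Lasso coefficients first.
    order = support[np.argsort(-np.abs(beta[support]))]
    # Start from the empty model, whose RSS is ||y||^2 (no intercept).
    best_k, best_gic = 0, float(y @ y)
    for k in range(1, len(order) + 1):
        Xk = X[:, order[:k]]
        coef, *_ = np.linalg.lstsq(Xk, y, rcond=None)  # refit by least squares
        r = y - Xk @ coef
        gic = float(r @ r) + penalty * k
        if gic < best_gic:
            best_k, best_gic = k, gic
    return order[:best_k].tolist()

# Synthetic example: 3 active predictors out of 20.
rng = np.random.default_rng(0)
n, p = 400, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + rng.standard_normal(n)

sigma2_bound = 1.0                        # assumed upper bound on the noise variance
penalty = 4.0 * sigma2_bound * np.log(p)  # illustrative choice, an assumption
selected = select_nested_gic(X, y, lam=0.3, penalty=penalty)
print(sorted(selected))
```

The nested family has at most as many members as the Lasso support, so the selection step costs only a handful of least-squares refits. Note that the inputs to the penalty in this sketch are the kind of quantities the abstract lists as sufficient: the number of predictors and a bound on the noise level, rather than unknown design-dependent constants.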
