Loss-guided Stability Selection

In modern data analysis, sparse model selection becomes inevitable once the number of predictor variables is very high. It is well known that model selection procedures such as the Lasso or Boosting tend to overfit on real data. The celebrated Stability Selection overcomes this weakness by aggregating models fitted on subsamples of the training data and then choosing a stable predictor set, which is usually much sparser than the predictor sets of the raw models. The standard Stability Selection is based on a global criterion, namely the per-family error rate, and additionally requires expert knowledge to configure its hyperparameters suitably. Model selection depends on the loss function: the predictor set selected w.r.t. one loss function generally differs from the set selected w.r.t. another. We therefore propose a Stability Selection variant that respects the chosen loss function through an additional validation step based on out-of-sample validation data, optionally enhanced with an exhaustive search strategy. Our Stability Selection variants are widely applicable and user-friendly. Moreover, they can avoid the severe underfitting that affects the original Stability Selection on noisy high-dimensional data. Our priority is therefore not to avoid false positives at all costs but to obtain a sparse, stable model with which one can make predictions. In experiments covering both regression and binary classification, with Boosting as the model selection algorithm, we observe a significant precision improvement over raw Boosting models, without any of the mentioned issues of the original Stability Selection.
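The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`boost_select`, `loss_guided_stability_selection`), the threshold grid, the subsample size n/2, the squared loss, and the synthetic data are all assumptions made for illustration, and the exhaustive search option is omitted. Componentwise L2-Boosting stands in for the Boosting base learner, predictors are assumed to be roughly centred, and the stable set is chosen by the selection-frequency threshold that gives the lowest loss on held-out validation data.

```python
import random

def boost_select(X, y, cols, steps=30, nu=0.1):
    """Componentwise L2-Boosting over columns `cols` (no intercept).
    Returns {column index: coefficient} for columns picked at least once."""
    cols = list(cols)
    n = len(y)
    sxx = {j: sum(X[i][j] ** 2 for i in range(n)) for j in cols}
    resid, coef = list(y), {}
    for _ in range(steps):
        best_j, best_b, best_gain = None, 0.0, 0.0
        for j in cols:
            if sxx[j] == 0.0:
                continue
            sxr = sum(X[i][j] * resid[i] for i in range(n))
            gain = sxr * sxr / sxx[j]  # RSS drop of a full least-squares step
            if gain > best_gain:
                best_j, best_b, best_gain = j, sxr / sxx[j], gain
        if best_j is None:
            break
        coef[best_j] = coef.get(best_j, 0.0) + nu * best_b
        for i in range(n):
            resid[i] -= nu * best_b * X[i][best_j]
    return coef

def mse(X, y, coef):
    """Mean squared error of the sparse linear model `coef` on (X, y)."""
    return sum((y[i] - sum(b * X[i][j] for j, b in coef.items())) ** 2
               for i in range(len(y))) / len(y)

def loss_guided_stability_selection(X_tr, y_tr, X_val, y_val, B=30,
                                    thresholds=(0.5, 0.6, 0.7, 0.8, 0.9), seed=0):
    rng = random.Random(seed)
    n, p = len(y_tr), len(X_tr[0])
    counts = [0] * p
    for _ in range(B):  # aggregate selections over B subsamples of size n/2
        idx = rng.sample(range(n), n // 2)
        Xs, ys = [X_tr[i] for i in idx], [y_tr[i] for i in idx]
        for j in boost_select(Xs, ys, range(p)):
            counts[j] += 1
    freq = [c / B for c in counts]
    best = None
    for t in thresholds:  # validation step: pick the threshold by held-out loss
        stable = [j for j in range(p) if freq[j] >= t]
        if not stable:
            continue
        coef = boost_select(X_tr, y_tr, stable, steps=200)  # refit on stable set
        loss = mse(X_val, y_val, coef)
        if best is None or loss < best[0]:
            best = (loss, t, stable, coef)
    return freq, best

# Synthetic example (hypothetical data): 3 relevant predictors out of 20.
rng = random.Random(1)
p, beta = 20, {0: 3.0, 1: -2.0, 2: 1.5}

def draw(n):
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(b * row[j] for j, b in beta.items()) + rng.gauss(0, 0.5) for row in X]
    return X, y

X_tr, y_tr = draw(120)
X_val, y_val = draw(60)
freq, best = loss_guided_stability_selection(X_tr, y_tr, X_val, y_val)
val_loss, threshold, stable, coef = best
print("threshold:", threshold, "stable set:", stable, "validation MSE:", round(val_loss, 3))
```

Replacing `mse` with another loss changes which threshold, and hence which stable predictor set, is chosen; this is the sense in which the selection is guided by the loss.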


Related content

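Loss function (machine learning)

A loss function, also called a distance function or metric function in AI, measures how far a model's prediction f(x) deviates from the true value Y. "Distance" is meant abstractly here: it stands for the error between the observed data and the predicted data. The loss is a non-negative real-valued function, usually written L(Y, f(x)); the smaller the loss, the better the model's predictions agree with the data. The loss function is the core of the empirical risk function and an important component of the structural risk function.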
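As a small illustration of L(Y, f(x)) and of the empirical risk built from it, the sketch below defines three common losses and averages them over a sample. The function names and the toy predictions are assumptions chosen for illustration. The example also shows that two losses can rank the same pair of models differently.

```python
import math

def squared_loss(y, f):
    return (y - f) ** 2

def absolute_loss(y, f):
    return abs(y - f)

def logistic_loss(y, f):
    """Binary classification loss for labels y in {-1, +1} and a real-valued score f."""
    return math.log(1.0 + math.exp(-y * f))

def empirical_risk(loss, ys, fs):
    """Average loss over the sample: (1/n) * sum of L(y_i, f(x_i))."""
    return sum(loss(y, f) for y, f in zip(ys, fs)) / len(ys)

# Toy data: true values and the predictions of two hypothetical models.
ys = [0.0, 0.0, 0.0, 0.0]
model_a = [0.5, 0.5, 0.5, 0.5]   # small error everywhere
model_b = [0.0, 0.0, 0.0, 1.6]   # exact three times, one large error

for name, loss in [("squared", squared_loss), ("absolute", absolute_loss)]:
    ra = empirical_risk(loss, ys, model_a)
    rb = empirical_risk(loss, ys, model_b)
    print(name, "risk A:", round(ra, 3), "risk B:", round(rb, 3),
          "preferred:", "A" if ra < rb else "B")
```

Under the squared loss model A has the lower empirical risk, while under the absolute loss model B does, because the squared loss penalises the single large error more heavily.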
