Predictor screening rules, which discard predictors from the design matrix before fitting a model, have had considerable impact on the speed with which l1-regularized regression problems, such as the lasso, can be solved. Current state-of-the-art screening rules, however, have difficulties in dealing with highly correlated predictors, often becoming too conservative. In this paper, we present a new screening rule to deal with this issue: the Hessian Screening Rule. The rule uses second-order information from the model to provide more accurate screening as well as higher-quality warm starts. The proposed rule outperforms all studied alternatives on data sets with high correlation for both l1-regularized least-squares (the lasso) and logistic regression. It also performs best overall on the real data sets that we examine.
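To make the idea of a screening rule concrete, the following is a minimal sketch of one well-known existing rule, the sequential strong rule, and not the Hessian Screening Rule proposed here. It assumes the lasso objective (1/2)||y - Xb||^2 + lambda*||b||_1 without any scaling by the number of observations. The helper names `correlations` and `strong_rule_keep` and the toy data are hypothetical and chosen only for illustration.

```python
def correlations(X, r):
    """Return c_j = x_j^T r for each column j of X (X is a list of rows)."""
    n, p = len(X), len(X[0])
    return [sum(X[i][j] * r[i] for i in range(n)) for j in range(p)]


def strong_rule_keep(X, residual, lam_prev, lam_next):
    """Indices of predictors kept by the sequential strong rule.

    Predictor j is discarded when |x_j^T residual| < 2*lam_next - lam_prev,
    where residual = y - X @ beta_hat(lam_prev) is taken from the previous
    solution on the regularization path.
    """
    threshold = 2.0 * lam_next - lam_prev
    c = correlations(X, residual)
    return [j for j, cj in enumerate(c) if abs(cj) >= threshold]


# Toy data (made up): predictors 0 and 1 are highly correlated,
# predictor 2 is unrelated to the response.
X = [
    [1.0, 0.9, 0.1],
    [-1.0, -0.8, 0.2],
    [1.0, 1.1, -0.1],
    [-1.0, -1.0, -0.2],
]
y = [2.0, -2.0, 2.0, -2.0]

# At lambda_max the lasso solution is all zeros, so the residual equals y.
lam_max = max(abs(c) for c in correlations(X, y))
kept = strong_rule_keep(X, y, lam_prev=lam_max, lam_next=0.95 * lam_max)
print(kept)
```

In this toy example the rule discards the unrelated predictor but keeps both correlated ones, even though only the first is active in the lasso solution at the new lambda. This is the kind of conservativeness under high correlation that the abstract describes.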