Predictor screening rules, which discard predictors before fitting a model, have considerably sped up the solution of sparse regression problems such as the lasso. In this paper we present a new screening rule for solving the lasso path: the Hessian Screening Rule. The rule uses second-order information from the model to provide both effective screening, particularly in the case of high correlation, and accurate warm starts. The proposed rule outperforms all alternatives we study on simulated data sets with both low and high correlation for $\ell_1$-regularized least-squares (the lasso) and logistic regression. It also generally performs best on the real data sets that we examine.
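To make the idea of screening along a lasso path concrete, the following is a minimal sketch of a generic screening loop. It does not implement the Hessian Screening Rule, whose formula is not given in this abstract; it uses the well-known sequential strong rule instead, which discards predictor $j$ at $\lambda_{k+1}$ when $|x_j^\top r(\lambda_k)| < 2\lambda_{k+1} - \lambda_k$ and then checks the KKT conditions to catch any mistakes. The objective scaling $\tfrac{1}{2}\lVert y - X\beta\rVert_2^2 + \lambda\lVert\beta\rVert_1$, the plain coordinate-descent solver, and the names `lasso_cd` and `lasso_path_strong` are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, beta, active, n_iter=1000, tol=1e-10):
    """Cyclic coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1,
    updating only the columns listed in `active` (hypothetical helper)."""
    beta = beta.copy()
    r = y - X @ beta                      # current residual
    col_sq = (X ** 2).sum(axis=0)         # squared column norms
    for _ in range(n_iter):
        max_delta = 0.0
        for j in active:
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            rho = X[:, j] @ r + col_sq[j] * old
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != old:
                r -= X[:, j] * (new - old)
                beta[j] = new
                max_delta = max(max_delta, abs(new - old))
        if max_delta < tol:
            break
    return beta


def lasso_path_strong(X, y, n_lambda=20, ratio=0.01):
    """Lasso path with sequential strong-rule screening and KKT checks.
    Each fit is warm-started from the previous solution on the path."""
    p = X.shape[1]
    lam_max = np.max(np.abs(X.T @ y))     # smallest lam with an all-zero solution
    lambdas = lam_max * np.logspace(0, np.log10(ratio), n_lambda)
    beta = np.zeros(p)
    betas, n_kept = [], []
    lam_prev = lam_max
    for lam in lambdas:
        c = X.T @ (y - X @ beta)          # correlations at the previous solution
        # Strong rule: keep predictors that pass the threshold, plus the
        # current support so the warm start stays valid.
        keep = (np.abs(c) >= 2 * lam - lam_prev) | (beta != 0)
        while True:
            beta = lasso_cd(X, y, lam, beta, np.flatnonzero(keep))
            c = X.T @ (y - X @ beta)
            # KKT check: a discarded predictor must satisfy |c_j| <= lam.
            violations = (~keep) & (np.abs(c) > lam * (1 + 1e-6))
            if not violations.any():
                break
            keep |= violations            # add violators and refit
        betas.append(beta.copy())
        n_kept.append(int(keep.sum()))
        lam_prev = lam
    return lambdas, np.array(betas), np.array(n_kept)
```

The strong rule is heuristic, which is why the KKT check and refit loop are needed; the number of predictors kept at each step (`n_kept`) gives a rough sense of how much work screening saves relative to fitting on all predictors.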