Sparse regression and classification estimators capable of group selection apply to a range of statistical problems, from multitask learning to sparse additive modeling to hierarchical selection. This work introduces a class of group-sparse estimators that combine group subset selection with group lasso or ridge shrinkage. We develop an optimization framework for fitting the nonconvex regularization surface and present finite-sample error bounds for estimation of the regression function. Our methods and analyses accommodate the general setting where groups overlap. As an application of group selection, we study sparse semiparametric modeling, a procedure that allows the effect of each predictor to be zero, linear, or nonlinear. For this task, the new estimators outperform alternatives across several metrics on synthetic data. Finally, we demonstrate their efficacy in modeling supermarket foot traffic and economic recessions using many predictors. All of our proposals are made available in the scalable implementation grpsel.
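
To make the idea of combining group subset selection with group lasso or ridge shrinkage concrete, the following is a minimal sketch of how such a penalty might be evaluated for a given coefficient vector. It is an illustration under stated assumptions, not the formulation from the paper or the grpsel API: the function name `group_sparse_penalty`, the parameters `lam0` and `lam1`, and the restriction to non-overlapping, unweighted groups are all hypothetical choices made for this example.

```python
import numpy as np


def group_sparse_penalty(beta, groups, lam0, lam1, shrinkage="lasso"):
    """Evaluate an illustrative group-sparse penalty (hypothetical helper).

    Assumed form, for non-overlapping groups g:
        lam0 * (number of groups with a nonzero coefficient block)
      + lam1 * sum_g ||beta_g||_2       if shrinkage == "lasso"
      + lam1 * sum_g ||beta_g||_2 ** 2  if shrinkage == "ridge"
    """
    if shrinkage not in ("lasso", "ridge"):
        raise ValueError("shrinkage must be 'lasso' or 'ridge'")
    beta = np.asarray(beta, dtype=float)
    total = 0.0
    for idx in groups:
        norm = np.linalg.norm(beta[idx])  # Euclidean norm of the group's block
        if norm > 0:
            total += lam0  # subset-selection cost paid once per active group
        if shrinkage == "lasso":
            total += lam1 * norm
        else:
            total += lam1 * norm ** 2
    return total


# Three groups; the middle group is entirely zero, so it is "unselected".
beta = np.array([3.0, 4.0, 0.0, 0.0, 1.0])
groups = [[0, 1], [2, 3], [4]]

penalty_lasso = group_sparse_penalty(beta, groups, lam0=1.0, lam1=0.5)
penalty_ridge = group_sparse_penalty(beta, groups, lam0=1.0, lam1=0.5,
                                     shrinkage="ridge")
print(penalty_lasso, penalty_ridge)
```

In this sketch the `lam0` term counts active groups, which is what makes the overall problem nonconvex, while the `lam1` term shrinks the coefficients of the groups that remain. The overlapping-group setting mentioned in the abstract would need additional machinery that is not shown here.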