High-dimensional datasets have become ubiquitous in the past few decades, often with many more covariates than observations. In the frequentist setting, penalized likelihood methods are the most popular approach for variable selection and estimation in high-dimensional data. In the Bayesian framework, spike-and-slab methods are commonly used as probabilistic constructs for high-dimensional modeling. Within the context of linear regression, Rockova and George (2018) introduced the spike-and-slab LASSO (SSL), an approach based on a prior that provides a continuum between the penalized likelihood LASSO and the Bayesian point-mass spike-and-slab formulations. Since its inception, the spike-and-slab LASSO has been extended to a variety of contexts, including generalized linear models, factor analysis, graphical models, and nonparametric regression. The goal of this paper is to survey the landscape surrounding spike-and-slab LASSO methodology. First, we elucidate the attractive properties and the computational tractability of SSL priors in high dimensions. We then review methodological developments of the SSL and outline several theoretical developments. Finally, we illustrate the methodology on both simulated and real datasets.