Variable selection in ultra-high dimensional linear regression is often preceded by a screening step that substantially reduces the dimension. Here a Bayesian variable screening method (BITS) is developed. BITS can incorporate prior knowledge, if any, on effect sizes and on the number of true variables. BITS iteratively includes the candidate variable with the highest posterior probability given the variables already selected. It is implemented via a fast Cholesky update algorithm and is shown to have the screening consistency property. Although BITS is built on a model with Gaussian errors, its screening consistency is proved to hold under more general tail conditions. The notion of posterior screening consistency ensures that the resulting model provides a good starting point for subsequent Bayesian variable selection methods. A new screening-consistent stopping rule based on posterior probability is also developed. Simulation studies and real data examples demonstrate the scalability and strong screening performance of BITS.
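To illustrate the iterative mechanism described above, the following is a minimal sketch of greedy forward screening in which the Cholesky factor of X_S^T X_S is extended by a border update each time a variable enters, rather than refactorized from scratch. The scoring rule here (absolute correlation with the current residual) is a simple stand-in for BITS' posterior-probability criterion; the function name and all details are illustrative, not the paper's actual implementation.

```python
import numpy as np

def forward_screen(X, y, k):
    """Greedy forward screening sketch: at each step, add the candidate
    most correlated with the current residual (a proxy for a posterior
    score), maintaining an upper-triangular Cholesky factor R of
    X_S^T X_S via border updates instead of refactorizing."""
    n, p = X.shape
    selected = []
    R = np.zeros((0, 0))  # Cholesky factor of the selected Gram matrix
    for _ in range(k):
        if selected:
            Xs = X[:, selected]
            # Solve (X_S^T X_S) beta = X_S^T y with two triangular solves
            b = np.linalg.solve(R.T, Xs.T @ y)
            beta = np.linalg.solve(R, b)
            resid = y - Xs @ beta
        else:
            resid = y
        # Score all candidates, masking out already-selected variables
        scores = np.abs(X.T @ resid)
        scores[selected] = -np.inf
        j = int(np.argmax(scores))
        # Border update: append column j to the Cholesky factor
        if selected:
            r = np.linalg.solve(R.T, X[:, selected].T @ X[:, j])
            rho = np.sqrt(X[:, j] @ X[:, j] - r @ r)
            R = np.block([[R, r[:, None]],
                          [np.zeros((1, len(selected))), rho]])
        else:
            R = np.array([[np.sqrt(X[:, j] @ X[:, j])]])
        selected.append(j)
    return selected
```

Each iteration costs O(n p) for scoring plus O(|S|^2) for the triangular solves, which is what makes this style of update fast enough for ultra-high dimensional screening.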