This paper revisits the knockoff-based multiple testing setup considered in Barber & Candes (2015) for variable selection in a linear regression model with $n\ge 2d$, where $n$ is the sample size and $d$ is the number of explanatory variables. The Benjamini-Hochberg (BH) method based on ordinary least squares estimates of the regression coefficients is adapted to this setup, making it a valid $p$-value based FDR controlling method that does not rely on any specific correlation structure of the explanatory variables. Simulations and real data applications demonstrate that our proposed method, both in its original form and in a data-adaptive version that incorporates an estimate of the proportion of truly unimportant explanatory variables, is a powerful competitor of the FDR controlling methods in Barber & Candes (2015).
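For orientation, the following is a minimal sketch of the standard BH step-up rule that the proposed method builds on. It is not the adjusted procedure of this paper: it simply takes a list of $p$-values as given, and the function name `benjamini_hochberg`, the level `alpha`, and the example $p$-values are hypothetical choices made for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return the sorted indices rejected by the standard BH step-up rule.

    Illustrative only: this is the textbook BH procedure, not the
    knockoff-adjusted version proposed in the paper.
    """
    m = len(p_values)
    # Order hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda j: p_values[j])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    return sorted(order[:k_max])


# Hypothetical p-values, one per explanatory variable.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(p, alpha=0.05)
print(selected)
```

Because the rule is step-up, a hypothesis whose own $p$-value exceeds its threshold can still be rejected when a larger $p$-value further down the ordering meets its threshold.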